QUALITATIVE INEQUALITIES FOR SQUARED PARTIAL CORRELATIONS OF A

GAUSSIAN RANDOM VECTOR 

SANJAY CHAUDHURI 


Abstract. We describe various sets of conditional independence relationships, sufficient for qualitatively 
comparing non-vanishing squared partial correlations of a Gaussian random vector. These sufficient conditions
are satisfied by several graphical Markov models. Rules for comparing degree of association among the
vertices of such Gaussian graphical models are also developed. We apply these rules to compare conditional 
dependencies on Gaussian trees. In particular for trees, we show that such dependence can be completely 
characterised by the length of the paths joining the dependent vertices to each other and to the vertices 
conditioned on. We also apply our results to postulate rules for model selection for polytree models. Our 
rules apply to mutual information of Gaussian random vectors as well. 


1. Introduction 


In the graphical Markov models literature, several attempts have been made to characterise the degree of conditional association among the vertices by the structure of the underlying graph. Such knowledge is considered useful in model selection. For example, Cheng et al. (2002) describe an algorithm of model selection for directed acyclic graphs (DAG) which assumes that the mutual information has a monotone relationship with a certain structure based length of the path. Examples (Chickering and Meek, 2006) show that such a monotone DAG faithfulness property, or a similar compound monotone DAG faithfulness property, does not hold even for simple binary DAGs. In fact, except in some specific cases, e.g. Greenland (2003) in epidemiology and Spirtes et al. (2000, causal pipes) in causal analysis, no result is known in this context.

A more general problem is to order the squared partial correlation coefficients among the components 
of a Gaussian random vector. For these random vectors, squared partial correlation coefficients completely 
measure the degree of association between its components conditional on a subset of the components. This 
measure is a polynomial in the entries of their covariance matrices. Thus in many situations it is beneficial 
to be able to order squared partial correlation coefficients in a way, such that the ordering does not depend 
on the specific values of the covariances. 

Simple counter-examples show that such qualitative comparisons cannot hold unless the covariance matrix 
belongs to certain subsets of positive definite matrices. In this article, we specify such subsets by conditional 
independence relationships. For a graphical Markov model validity of such relationships can be simply read 
off from the underlying graph. Thus rules for comparing degree of association on various Gaussian graphical 
models can be developed. 

In this article we show that, certain conditional independence relationships holding, suitable squared partial correlations can be qualitatively compared. We make two kinds of comparisons. In the first, the set of components conditioned on (conditionate) is kept fixed and we change the dependent vertices (correlates). More importantly, in the second, we fix the two correlates and compare their degree of dependence by varying the conditionates. The sufficient conditional independence relationships are satisfied by several graphical Markov models. Using relevant separation criteria, e.g. separation for undirected graphs (UG) (see Definition 1), d-separation for DAGs (Verma and Pearl, 1990) (see Definition 4), m-separation for mixed ancestral graphs (MAGs) (see supplement) (Richardson and Spirtes, 2002) etc., we postulate sufficient structural conditions for comparing conditional association on them. We emphasize that the specific graphical Markov models are used as illustrations. Our results apply to a much wider class of models. Furthermore, using the fact that for tree and polytree (DAGs without any undirected cycles, or singly connected directed acyclic graphs) models, any two connected components have exactly one path joining them, these structural criteria can be simplified to path based rules for comparison. We discuss such rules for trees in detail, where it is also shown that our rules for comparing the squared partial correlations are complete.

The inequalities discussed here have theoretical interest as new properties of Gaussian random vectors and directly translate to corresponding conditional non-Shannon type information inequalities (Matúš, 2006, 2007; Zhang and Yeung, 1997). Matúš (2005) considers implications of one set of conditional independence relations on other conditional independencies for Gaussian random vectors. Furthermore, he describes a way to determine such implications using the ring of polynomials generated by the entries of the correlation matrices with some additional indeterminates. Our results describe some polynomial inequalities these rings satisfy.

Our main motivation comes from the Gaussian graphical Markov models. These results are canonical and sufficient to postulate structure based rules to order dependencies on several of them. We improve upon Chaudhuri (2005) and Chaudhuri and Richardson (2003), who only consider polytree models. These results can be used in determining the distortion effects (Wermuth and Cox, 2008) and monotonic effects (VanderWeele and Robins, 2007, 2010) of confounded variables in epidemiology and causal network analysis (see also Greenland and Pearl (2011)). We postulate necessary and sufficient conditions for determining structures on a class of polytree models. These conditions can be directly applied in model selection, especially in mapping river flow and drainage networks where such polytree models occur naturally (Rodriguez-Iturbe and Rinaldo, 2001). In real data analysis, these inequalities would be useful for model selection, especially among various graphical Markov models (Cheng et al., 2002; Shimizu et al., 2006). For these models our results would translate to hypotheses connected to the structure of the graph. These hypotheses can be tested from the observed data. Structure based inequalities may also be used as constraints in estimation with missing values. They are also relevant in choosing prior distributions in Bayesian procedures. The qualitative bounds can be used in selecting stratifying variables in designing surveys, gathering the most relevant information in forensic sciences and building strategies for constrained searches. Further, these results may have applications in designing effective updating and blocking strategies in Gibbs sampling and Markov chain Monte Carlo procedures (see e.g. Roberts and Sahu (1997) etc.).


2. Squared partial correlation inequalities 

Suppose $V \sim N(\mu, \Sigma)$ with a positive definite $\Sigma$. Let a, b, c, c', z, z', x etc. be the components and B, Z etc. be the subsets of components of V. In this article V will also denote the vertex set of the underlying graph (see supplement for more details). Let $\emptyset$ denote the empty set.

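The squared partial correlation coefficient ($\rho^2_{ac|Z}$) between a and c conditional on Z is defined by:

(1)    $\rho^2_{ac|Z} = \dfrac{(\sigma_{ac} - \Sigma_{aZ}\Sigma_{ZZ}^{-1}\Sigma_{Zc})^2}{(\sigma_{aa} - \Sigma_{aZ}\Sigma_{ZZ}^{-1}\Sigma_{Za})(\sigma_{cc} - \Sigma_{cZ}\Sigma_{ZZ}^{-1}\Sigma_{Zc})} = 1 - e^{-2\,\mathrm{Inf}(a \perp\!\!\!\perp c|Z)}.$
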

Here $\sigma_{ab}$ and $\Sigma_{aZ}$ respectively denote the (a, b)th element and the $a \times Z$ submatrix of $\Sigma$. $\mathrm{Inf}(a \perp\!\!\!\perp c|Z)$ is the mutual information (Whittaker, 2008, information proper) of a and c given Z. From (1) it follows that the mutual information is a monotone increasing function of the corresponding squared partial correlation. Thus the qualitative inequalities for $\rho^2_{ac|Z}$ presented below apply to $\mathrm{Inf}(a \perp\!\!\!\perp c|Z)$ as well.
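
As a minimal numerical sketch of definition (1), assuming NumPy and zero-based integer indices for the components of V, the squared partial correlation can be computed from a covariance matrix through a Schur complement. The function names `sq_partial_corr` and `gaussian_cond_info` and the example matrix are illustrative choices, not taken from the article.

```python
import numpy as np

def sq_partial_corr(sigma, a, c, cond=()):
    """Squared partial correlation of components a and c given the indices in cond."""
    cond = list(cond)
    idx = [a, c] + cond
    sub = np.asarray(sigma, dtype=float)[np.ix_(idx, idx)]
    if cond:
        s12 = sub[:2, 2:]
        # Schur complement: conditional covariance of (a, c) given cond
        s = sub[:2, :2] - s12 @ np.linalg.solve(sub[2:, 2:], s12.T)
    else:
        s = sub
    return s[0, 1] ** 2 / (s[0, 0] * s[1, 1])

def gaussian_cond_info(sigma, a, c, cond=()):
    """Conditional mutual information implied by (1): Inf = -0.5 * log(1 - rho^2)."""
    return -0.5 * np.log1p(-sq_partial_corr(sigma, a, c, cond))

# Hypothetical equicorrelated covariance matrix, used only for illustration
sigma = np.array([[1.0, 0.5, 0.5],
                  [0.5, 1.0, 0.5],
                  [0.5, 0.5, 1.0]])
rho2 = sq_partial_corr(sigma, 0, 1, [2])
info = gaussian_cond_info(sigma, 0, 1, [2])
```

Because the map from $\rho^2$ to the mutual information is monotone, any ordering of the values returned by `sq_partial_corr` carries over to `gaussian_cond_info`.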

2.1. Comparing conditional dependence with a fixed conditionate. We first fix a subset Z to be conditioned on and one correlate a. The squared partial correlation is compared by changing the other correlate from c to c'.

Theorem 1. Suppose $c' \perp\!\!\!\perp a|cZ$, then $\rho^2_{ac'|Z} \le \rho^2_{ac|Z}$.

Theorem 1 is a conditional version of the well-known information inequality (Cover and Thomas, 2006) and holds in general for mutual information of any distribution. For graphical Markov models the condition holds if c' is separated from a given c and Z. Further, for trees the condition is satisfied if c lies on the path joining a and c'. Thus a longer path implies weaker dependence in this case.
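
The following sketch checks Theorem 1 numerically on a small linear Gaussian DAG, assuming NumPy. The graph a → c → c', a → z, the random edge coefficients and the helper names are assumptions made for illustration. On this graph c' is d-separated from a given {c, z}, so the theorem predicts $\rho^2_{ac'|z} \le \rho^2_{ac|z}$ for every choice of coefficients.

```python
import numpy as np

def sem_covariance(n, edges):
    """Covariance of a linear Gaussian DAG with unit noise variances.
    edges maps (parent, child) -> regression coefficient."""
    coef = np.zeros((n, n))
    for (parent, child), weight in edges.items():
        coef[child, parent] = weight
    total = np.linalg.inv(np.eye(n) - coef)
    return total @ total.T

def sq_partial_corr(sigma, a, c, cond=()):
    """Squared partial correlation of a and c given cond, as in (1)."""
    cond = list(cond)
    idx = [a, c] + cond
    sub = sigma[np.ix_(idx, idx)]
    if cond:
        s12 = sub[:2, 2:]
        s = sub[:2, :2] - s12 @ np.linalg.solve(sub[2:, 2:], s12.T)
    else:
        s = sub
    return s[0, 1] ** 2 / (s[0, 0] * s[1, 1])

A, C, CP, Z = range(4)          # CP plays the role of c'
rng = np.random.default_rng(0)
checks = []
for _ in range(200):
    w = rng.uniform(-2.0, 2.0, size=3)
    sigma = sem_covariance(4, {(A, C): w[0], (C, CP): w[1], (A, Z): w[2]})
    near = sq_partial_corr(sigma, A, C, [Z])
    far = sq_partial_corr(sigma, A, CP, [Z])
    checks.append(far <= near + 1e-12)   # small tolerance for rounding
```

Every entry of `checks` should be true, since the inequality does not depend on the specific coefficient values.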
Figure 1. 1(a) A polytree, $Z = \{z_1, z_2, z_3, z_4\}$, $\rho^2_{ac_3|Z} \le \rho^2_{ac_2|Z}$, however $\rho^2_{ac_3|Z} \le \rho^2_{ac_1|Z}$ may not hold. 1(b) An UG satisfying the conditions of Theorem 2: $\rho^2_{ac} \ge \rho^2_{ac|z'} \ge \rho^2_{ac|z}$ and $\rho^2_{ax} \ge \rho^2_{ax|z'} \ge \rho^2_{ax|z}$. Further, from Theorem 1, $\rho^2_{ac} \le \rho^2_{ax}$, $\rho^2_{ac|z} \le \rho^2_{ax|z}$ and $\rho^2_{ac|z'} \le \rho^2_{ax|z'}$. Exactly the same conclusions hold on the DAG in 1(c).



Figure 2. Graphical models satisfying the conditions of Theorem 3. In each graph $a \perp\!\!\!\perp c$. The graph in 2(a) is a polytree. Here $B = \{b_1, b_2\}$ and $\rho^2_{ac|B} \le \rho^2_{ac|Bz'} \le \rho^2_{ac|Bz}$ holds. In 2(b) it follows that $\rho^2_{ac|y} \le \rho^2_{ac|x}$ (cf. Wermuth and Cox (2008)). The graph in 2(c) is a mixed ancestral graph (Richardson and Spirtes, 2002) where $\rho^2_{ac|z'} \le \rho^2_{ac|z}$ always holds.


For polytree models the condition depends on the arrangement of the arrows on the path joining a, c and c'. The condition is satisfied if two arrowheads do not meet at c on the path joining a and c' (i.e. c is not a collider on the path joining a and c'; see Definition 3). For example, in Figure 1(a) with $Z = \{z_1, z_2, z_3, z_4\}$, using the d-separation criterion (see Definition 4) we get $c_3 \perp\!\!\!\perp a|Zc_2$. Theorem 1 ensures that $\rho^2_{ac_3|Z} \le \rho^2_{ac_2|Z}$. The same d-separation criterion however implies that $c_3 \not\perp\!\!\!\perp a|Zc_1$, so there is no guarantee that $\rho^2_{ac_1|Z}$ would be larger than $\rho^2_{ac_3|Z}$. This partially justifies the intuitive argument given in Greenland (2003) (see also Greenland and Pearl (2011)).


2.2. Comparing conditional dependence with fixed correlates. Here two components a and c of V are held fixed. We consider the variation in $\rho^2_{ac|Z}$ for different subsets Z of V. Depending on the nature of
pairwise unconditional association between a, c and the sets conditioned on, three situations may arise.

2.2.1. Situation 1. The components a, c, z and z' are unconditionally pairwise dependent. 
Figure 3. Graphical models satisfying the conditions of Theorem 4. Each model satisfies the condition (i) of the theorem. 3(a) is a polytree on which $aczz' \perp\!\!\!\perp \{b_1, b_2\}|x$ holds. In 3(b) $ac \perp\!\!\!\perp b|x$, but $ac \not\perp\!\!\!\perp b|zx$. In 3(c) $ac \perp\!\!\!\perp b|zx$ but $ac \not\perp\!\!\!\perp b|x$. From Theorem 4 it follows that $\rho^2_{ac|B} \le \rho^2_{ac|Bz'} \le \rho^2_{ac|Bz}$.



Theorem 2. Suppose for some x, $a \perp\!\!\!\perp c|x$ and $ac \perp\!\!\!\perp z|x$. Then $\rho^2_{ac|z} \le \rho^2_{ac}$. In addition, if $ac \perp\!\!\!\perp z'|z$, then $\rho^2_{ac|z} \le \rho^2_{ac|z'} \le \rho^2_{ac}$.

The conditions of Theorem 2 can be represented by several graphical Markov models, e.g. undirected graphs, directed acyclic graphs etc. The conditional independence conditions imply that a, c and z have to be pairwise separated given x and z' has to be separated from a and c given z.

The first part shows that under these conditions the dependence of a on c always reduces on conditioning. For tree and polytree models the conclusion of the second part can be intuitively explained. Notice that, by assumption $\rho^2_{ac} \ge \rho^2_{ac|x} = 0$ and the separation criteria imply that z' is farther away from x than z. Thus z' has less information about x than z. So $\rho^2_{ac|z'}$ should be closer to $\rho^2_{ac}$ than $\rho^2_{ac|z}$. In other words, conditioning on the vertices farther away from the path between a and c increases the degree of association.
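
A small worked example may make Theorem 2 concrete. The sketch below assumes NumPy and a hypothetical tree x → a, x → c, x → z, z → z' with all regression coefficients and noise variances equal to 1; the graph and the helper names are illustrative, not taken from the article. On this tree $a \perp\!\!\!\perp c|x$, $ac \perp\!\!\!\perp z|x$ and $ac \perp\!\!\!\perp z'|z$ hold by d-separation.

```python
import numpy as np

def sem_covariance(n, edges):
    """Covariance of a linear Gaussian DAG with unit noise variances.
    edges maps (parent, child) -> regression coefficient."""
    coef = np.zeros((n, n))
    for (parent, child), weight in edges.items():
        coef[child, parent] = weight
    total = np.linalg.inv(np.eye(n) - coef)
    return total @ total.T

def sq_partial_corr(sigma, a, c, cond=()):
    """Squared partial correlation of a and c given cond, as in (1)."""
    cond = list(cond)
    idx = [a, c] + cond
    sub = sigma[np.ix_(idx, idx)]
    if cond:
        s12 = sub[:2, 2:]
        s = sub[:2, :2] - s12 @ np.linalg.solve(sub[2:, 2:], s12.T)
    else:
        s = sub
    return s[0, 1] ** 2 / (s[0, 0] * s[1, 1])

X, A, C, Z, ZP = range(5)       # ZP plays the role of z'
sigma = sem_covariance(5, {(X, A): 1.0, (X, C): 1.0, (X, Z): 1.0, (Z, ZP): 1.0})

rho2_marginal = sq_partial_corr(sigma, A, C)        # 1/4 for these parameters
rho2_given_zp = sq_partial_corr(sigma, A, C, [ZP])  # 4/25
rho2_given_z = sq_partial_corr(sigma, A, C, [Z])    # 1/9
rho2_given_x = sq_partial_corr(sigma, A, C, [X])    # 0, since a and c are separated by x
```

The values follow the ordering $\rho^2_{ac|x} \le \rho^2_{ac|z} \le \rho^2_{ac|z'} \le \rho^2_{ac}$: the farther the conditioning vertex is from x, the closer the result is to the unconditional value.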

2.2.2. Situation 2. The correlates a and c are independent, but both are dependent on the sets conditioned 
on. 


Theorem 3. Suppose $a \perp\!\!\!\perp c$ and for some x, the condition $ac \perp\!\!\!\perp zB|x$ holds. Then $\rho^2_{ac|B} \le \rho^2_{ac|Bz}$. Moreover, if $z' \perp\!\!\!\perp acB|z$ holds, then $\rho^2_{ac|B} \le \rho^2_{ac|Bz'} \le \rho^2_{ac|Bz}$.


By assumption $0 = \rho^2_{ac} \le \rho^2_{ac|B}$. Thus the first conclusion implies that conditioning on a larger set implies stronger association. On an UG, the condition $a \perp\!\!\!\perp c$ implies that a and c cannot be connected. Thus UGs are not useful to represent the conditions in Theorem 3. They are satisfied by several other graphical Markov models like DAGs, MAGs etc.

For polytree models (see Figure 2(a)) the conclusions of Theorem 3 can be intuitively explained as well. As before, one can conclude that z' is farther away from x and therefore has less information about x than z, $\rho^2_{ac|x} \ge 0$ but $\rho^2_{ac} = 0$. Thus by the same argument as for Theorem 2, conditioning on B and z' should produce weaker association than B and z.
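
The collider case of Theorem 3 can be sketched in the same way. The example assumes NumPy and a hypothetical polytree a → x ← c, x → b, x → z, z → z' with unit coefficients and unit noise variances, taking B = {b}; graph and names are illustrative only. Here $a \perp\!\!\!\perp c$, $ac \perp\!\!\!\perp zB|x$ and $z' \perp\!\!\!\perp acB|z$ hold by d-separation.

```python
import numpy as np

def sem_covariance(n, edges):
    """Covariance of a linear Gaussian DAG with unit noise variances.
    edges maps (parent, child) -> regression coefficient."""
    coef = np.zeros((n, n))
    for (parent, child), weight in edges.items():
        coef[child, parent] = weight
    total = np.linalg.inv(np.eye(n) - coef)
    return total @ total.T

def sq_partial_corr(sigma, a, c, cond=()):
    """Squared partial correlation of a and c given cond, as in (1)."""
    cond = list(cond)
    idx = [a, c] + cond
    sub = sigma[np.ix_(idx, idx)]
    if cond:
        s12 = sub[:2, 2:]
        s = sub[:2, :2] - s12 @ np.linalg.solve(sub[2:, 2:], s12.T)
    else:
        s = sub
    return s[0, 1] ** 2 / (s[0, 0] * s[1, 1])

A, C, X, B, Z, ZP = range(6)    # ZP plays the role of z'
edges = {(A, X): 1.0, (C, X): 1.0, (X, B): 1.0, (X, Z): 1.0, (Z, ZP): 1.0}
sigma = sem_covariance(6, edges)

rho2_marginal = sq_partial_corr(sigma, A, C)         # 0: a and c are independent
rho2_b = sq_partial_corr(sigma, A, C, [B])           # 1/9
rho2_b_zp = sq_partial_corr(sigma, A, C, [B, ZP])    # 9/64
rho2_b_z = sq_partial_corr(sigma, A, C, [B, Z])      # 4/25
```

In contrast to Theorem 2, adding the nearer vertex z to the conditionate produces the strongest association here.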

In the graph in Figure 2(b) the marginal covariance matrix of a, c, x and y satisfies the conditions of Theorem 3. Thus, $\rho^2_{ac|y} \le \rho^2_{ac|x}$. The graph in Figure 2(c) is a mixed ancestral graph (notice the $\leftrightarrow$ edge between $y_1$ and $y_2$ (Richardson and Spirtes, 2002)). Here the marginal covariance matrix of a, c, $x_2$, z and z' would satisfy the conditions of Theorem 3 (see Appendix B). So we conclude that $\rho^2_{ac|z'} \le \rho^2_{ac|z} \le \rho^2_{ac|x_2}$.



2.2.3. Situation 3. At least one of a and c is independent of both the sets conditioned on. 


Theorem 4. Suppose $a \perp\!\!\!\perp z$. Let for some x, $\Sigma$ satisfy one of the following two ((i), (ii)) conditions:

(i) $c \perp\!\!\!\perp az$ and one of the following six conditions (a) $az \perp\!\!\!\perp B|x$, (b) $az \perp\!\!\!\perp B|cx$, (c) $cz \perp\!\!\!\perp B|x$, (d) $cz \perp\!\!\!\perp B|ax$, (e) $ac \perp\!\!\!\perp B|x$ and (f) $ac \perp\!\!\!\perp B|xz$ holds,

(ii) $az \perp\!\!\!\perp cB|x$.

Then $\rho^2_{ac|B} \le \rho^2_{ac|Bz}$. Further, if $z' \perp\!\!\!\perp acB|z$ holds, then in both cases, $\rho^2_{ac|B} \le \rho^2_{ac|Bz'} \le \rho^2_{ac|Bz}$.

Figure 4. Graphical models satisfying the conditions (ii) of Theorem 4. In each graph the condition $az \perp\!\!\!\perp cB|x$ holds. In Figure 4(b) $az \perp\!\!\!\perp \{b_1, b_2\}|cx$ also holds. The graphs in 4(a) ($B = \{b_1, b_2, b_3\}$) and 4(b) ($B = \{b_2, b_3\}$) are polytrees. On each $\rho^2_{ac|B} \le \rho^2_{ac|Bz'} \le \rho^2_{ac|Bz}$ holds.



The difference between the conditions (i) and (ii) in Theorem 4 is illustrated in Figures 3(a) and 4(a). Under condition (i), $c \perp\!\!\!\perp z$ but the relation $c \perp\!\!\!\perp z|x$ does not necessarily hold. On the other hand, under condition (ii), $c \perp\!\!\!\perp z|x$ but c may not be independent of z unconditionally.


The six conditions in (i) are in general distinct. For example, from the m-connection rules (Richardson and Spirtes, 2002) on the MAG in Figure 3(b) we get (note the path $(a, c) \leftrightarrow x \leftrightarrow z \leftarrow b$) $ac \perp\!\!\!\perp b|x$ but $ac \not\perp\!\!\!\perp b|zx$ (see supplement). On the other hand, on the DAG in Figure 3(c) clearly $ac \perp\!\!\!\perp b|zx$ but $ac \not\perp\!\!\!\perp b|x$. Similar examples for the other four conditions can be drawn.


Theorem 4 goes beyond the DAGs considered by Chaudhuri and Richardson (2003). One example is considered in Figure 5(a). Here $a \perp\!\!\!\perp c$, $ac \perp\!\!\!\perp z$ and both $ac \perp\!\!\!\perp b|x$ and $ac \perp\!\!\!\perp b|xz$ hold. Consequently, from Theorem 4 the relationship $\rho^2_{ac|b} \le \rho^2_{ac|bz'} \le \rho^2_{ac|bz}$ follows. Note that z is not an ancestor of x but an ancestor of b and consequently, $zz' \perp\!\!\!\perp x$ also holds. Chaudhuri and Richardson (2003) explicitly exclude conditioning vertices which are independent of x.

Corollary 1. If $B = \emptyset$, under all conditions of Theorem 4(i), $\rho^2_{ac|z} = \rho^2_{ac|z'} = \rho^2_{ac} = 0$. Under condition (ii), $\rho^2_{ac|z} \ge \rho^2_{ac|z'} \ge \rho^2_{ac}$.


2.3. Comparison between Theorems 2 and 4 for polytree models. For polytree models, in view of Theorem 2 the conclusion of Theorem 4 (ii) is a bit counterintuitive. Note that, under (ii), $\rho^2_{ac|x} = 0$, which is the same as in Theorem 2. However, unlike the latter, conditioning on vertices farther away produces a weaker squared correlation in this case. The difference seems to be that in Theorem 2 $a \not\perp\!\!\!\perp z$, but we assume $a \perp\!\!\!\perp z|x$. In contrast, Theorem 4 assumes that $a \perp\!\!\!\perp z$, but in (ii), the condition $a \perp\!\!\!\perp z|x$ does not hold. As an illustration of this contrast we consider the graph in Figure 5(b). From Theorem 2 and Corollary 1 it follows that the relationship $\rho^2_{ac|v} \ge \rho^2_{ac|u} \ge \rho^2_{ac} \ge \rho^2_{ac|y} \ge \rho^2_{ac|w} \ge \rho^2_{ac|x} = 0$ holds.


Another such example can be constructed from the DAG in Figure 5(a). We have argued above that from Theorem 4 it follows that $\rho^2_{ac|b} \le \rho^2_{ac|bz'} \le \rho^2_{ac|bz}$. In the DAG in Figure 5(c) the relation $a \perp\!\!\!\perp c$ has been replaced by $a \perp\!\!\!\perp c|x$. From the rules of d-separation $a \perp\!\!\!\perp c|bx$, $ac \perp\!\!\!\perp z|bx$ and $acb \perp\!\!\!\perp z'|z$ (see Definition 4). Thus after conditioning on b, the covariance matrix of a, x, c, z and z' satisfies the conditions of Theorem 2. So the qualitative comparison holds, but in contrast to Figure 5(a) it follows that $\rho^2_{ac|b} \ge \rho^2_{ac|bz'} \ge \rho^2_{ac|bz}$.


2.4. Comparison between $\rho^2_{ac|x}$ and $\rho^2_{ac|Bz}$. If z = x, in Theorem 4 in all cases $a \perp\!\!\!\perp Bc|z$, so $\rho^2_{ac|z} = \rho^2_{ac|Bz} = 0$. When $x \in V \setminus z$, comparison between $\rho^2_{ac|x}$ and $\rho^2_{ac|Bz}$ does not directly follow from Theorem 4. Under condition (ii), $a \perp\!\!\!\perp c|x$, so $0 = \rho^2_{ac|x} \le \rho^2_{ac|Bz}$ for any z. However, under the conditions (i), $\rho^2_{ac|x}$ and $\rho^2_{ac|Bz}$ may not be qualitatively compared. We show this fact in the following theorem.

Figure 5. 5(a) A DAG not considered by Chaudhuri and Richardson (2003). From Theorem 4 it follows that $\rho^2_{ac|b} \le \rho^2_{ac|bz'} \le \rho^2_{ac|bz}$. 5(b) A DAG to illustrate the contrast in the conclusions of Theorem 2 and Theorem 4 (ii). Here $\rho^2_{ac|v} \ge \rho^2_{ac|u} \ge \rho^2_{ac} \ge \rho^2_{ac|y} \ge \rho^2_{ac|w} \ge \rho^2_{ac|x} = 0$. From Theorem 2, on the DAG in 5(c) it follows that $\rho^2_{ac|b} \ge \rho^2_{ac|bz'} \ge \rho^2_{ac|bz}$ always holds.


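Theorem 5. Suppose $a \perp\!\!\!\perp z$, $c \perp\!\!\!\perp az$, and $acz \perp\!\!\!\perp B|x$. Then $\rho^2_{ac|Bz} \ge \rho^2_{ac|x}$ iff [an inequality in the entries of $\Sigma$, involving terms such as $\sigma_{xx|B}$, $\sigma_{zz}$ and $\sigma_{xx}$, whose exact form is not legible in this copy].
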


Theorems 2, 3, 4 and 5 have a curious implication on polytree models. Notice that in Theorems 2 and 3 the vertex z is in the set of descendants of vertex x (see Figures 1(c) and 2(a)), whereas in Theorem 4 z may be a parent of x. The curious fact is that, on a polytree the squared partial correlations given the descendants of x cannot be compared with the squared partial correlations given the parents (or more generally given the ancestors of the parents of x). Furthermore, the behaviour of $\rho^2_{ac|x}$ is a continuation of the behaviour of squared partial correlations given its descendants. In other words, on polytrees, conditioning on the vertices "above" the path has a different nature than conditioning on the vertices "below" or "on" the path.

We present an illustrative example in Figure 6. We consider the polytree in Figure 6(a). In Figure 6(b) we plot the values of $\rho^2_{ac|i}$ for $i \in \{\emptyset, z_4, z_3, z_2, z_1, x, y_1, y_2, y_3, y_4\}$. All parameter values are fixed at 1. As predicted from Theorem 4, the squared partial correlation increases from $i = z_4$ to $i = z_1$ and from Corollary 1 each of them is larger than $\rho^2_{ac}$. However, from Theorem 2, $\rho^2_{ac|i}$ increases as we move from x to $y_4$ and each of them is smaller than $\rho^2_{ac}$. Thus the squared partial correlation drops discontinuously as we move from $z_1$ to x along the $z_4$ to $y_4$ path.
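
The "above" versus "below" behaviour can be reproduced on a small hypothetical polytree. The sketch below assumes NumPy and the graph z2 → z1 → x ← a, x → c, x → y1 → y2 with all parameters equal to 1. This graph is a guess at the qualitative shape of such an example and is not claimed to be the graph of Figure 6; the helper names are also illustrative.

```python
import numpy as np

def sem_covariance(n, edges):
    """Covariance of a linear Gaussian DAG with unit noise variances.
    edges maps (parent, child) -> regression coefficient."""
    coef = np.zeros((n, n))
    for (parent, child), weight in edges.items():
        coef[child, parent] = weight
    total = np.linalg.inv(np.eye(n) - coef)
    return total @ total.T

def sq_partial_corr(sigma, a, c, cond=()):
    """Squared partial correlation of a and c given cond, as in (1)."""
    cond = list(cond)
    idx = [a, c] + cond
    sub = sigma[np.ix_(idx, idx)]
    if cond:
        s12 = sub[:2, 2:]
        s = sub[:2, :2] - s12 @ np.linalg.solve(sub[2:, 2:], s12.T)
    else:
        s = sub
    return s[0, 1] ** 2 / (s[0, 0] * s[1, 1])

Z2, Z1, A, X, C, Y1, Y2 = range(7)
edges = {(Z2, Z1): 1.0, (Z1, X): 1.0, (A, X): 1.0,
         (X, C): 1.0, (X, Y1): 1.0, (Y1, Y2): 1.0}
sigma = sem_covariance(7, edges)

# Condition on one vertex at a time, moving along the z2 ... y2 path
rho2 = {"none": sq_partial_corr(sigma, A, C)}
for name, vertex in [("z2", Z2), ("z1", Z1), ("x", X), ("y1", Y1), ("y2", Y2)]:
    rho2[name] = sq_partial_corr(sigma, A, C, [vertex])
```

For these parameters the values are 1/4 and 1/3 given z2 and z1 (both above the unconditional 1/5), then 0 given x, and 1/36 and 2/35 given y1 and y2 (both below 1/5), which shows the drop when moving from the parents of x to x itself.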

2.5. Further generalisations on comparison with fixed correlates. Suppose $Z_1 = \{z_{11}, z_{12}, \ldots, z_{1n}\}$ and $Z_2 = \{z_{21}, z_{22}, \ldots, z_{2n}\}$ are two conditionates of cardinality n. Then for fixed correlates a and c, one can write:

(2)    $\dfrac{\rho^2_{ac|Z_1}}{\rho^2_{ac|Z_2}} = \prod_{i=1}^{n} \dfrac{\rho^2_{ac|z_{21}, z_{22}, \ldots, z_{2(i-1)}, z_{1i}, z_{1(i+1)}, \ldots, z_{1n}}}{\rho^2_{ac|z_{21}, z_{22}, \ldots, z_{2(i-1)}, z_{2i}, z_{1(i+1)}, \ldots, z_{1n}}}.$

Clearly $\rho^2_{ac|Z_1} \le \rho^2_{ac|Z_2}$ holds if each factor in the R.H.S. of (2) is bounded by 1.

Note that in each factor in (2) the conditionates in the numerator and the denominator differ only in one element. Thus in order to qualitatively compare $\rho^2_{ac|Z_1}$ and $\rho^2_{ac|Z_2}$ it is sufficient to find an $x_i$ for each factor such that $z_{1i}$ and $z_{2i}$ satisfy the conditions of one of the Theorems 2-4, possibly with $B \subseteq \{z_{21}, z_{22}, \ldots, z_{2(i-1)}, z_{1(i+1)}, \ldots, z_{1n}\}$ whenever necessary.

Using the factorisation in (2) and Theorems 2-4, structural and path based rules for comparison may be postulated for several graphical models. The choice of $x_i$ and these path based rules depend on the structure of association of the whole vector V. We consider the tree models below.
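
The factorisation (2) is a telescoping product and holds for any positive definite covariance matrix with non-vanishing partial correlations. A brief numerical sketch, assuming NumPy and an arbitrary randomly generated covariance matrix (the index choices and names are illustrative), is:

```python
import numpy as np

def sq_partial_corr(sigma, a, c, cond=()):
    """Squared partial correlation of a and c given cond, as in (1)."""
    cond = list(cond)
    idx = [a, c] + cond
    sub = sigma[np.ix_(idx, idx)]
    if cond:
        s12 = sub[:2, 2:]
        s = sub[:2, :2] - s12 @ np.linalg.solve(sub[2:, 2:], s12.T)
    else:
        s = sub
    return s[0, 1] ** 2 / (s[0, 0] * s[1, 1])

rng = np.random.default_rng(1)
m = rng.normal(size=(6, 6))
sigma = m @ m.T + 6.0 * np.eye(6)      # a generic positive definite matrix

a, c = 0, 1
z1 = [2, 3]                            # conditionate Z_1
z2 = [4, 5]                            # conditionate Z_2

factors = []
for i in range(len(z1)):
    numerator_set = z2[:i] + z1[i:]            # z_21..z_2(i-1), z_1i, ..., z_1n
    denominator_set = z2[:i + 1] + z1[i + 1:]  # z_21..z_2i, z_1(i+1), ..., z_1n
    factors.append(sq_partial_corr(sigma, a, c, numerator_set)
                   / sq_partial_corr(sigma, a, c, denominator_set))

ratio = sq_partial_corr(sigma, a, c, z1) / sq_partial_corr(sigma, a, c, z2)
product = float(np.prod(factors))
```

Consecutive factors share a conditionate, so `product` and `ratio` agree up to rounding; bounding each factor by 1 therefore bounds the ratio.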
Figure 6. 6(a) A polytree and 6(b) the value of $\rho^2_{ac|i}$ for $i \in \{\emptyset, z_4, z_3, z_2, z_1, x, y_1, y_2, y_3, y_4\}$. Each parameter is fixed at 1. 6(b) illustrates the discontinuous drop in $\rho^2_{ac|i}$ as we move from $z_1$ to x along the $z_4$ to $y_4$ path.


3. Application to tree models 


Let G = (V, E) be a tree with vertex set V and edge set E. For vertices $x \in V$ and $y \in V$, $x \pi y$ denotes the unique path joining x and y, which we define as:

$x \pi y = \{x = v_1, v_2, \ldots, v_{k-1}, v_k = y$ such that there is an edge between $v_i$ and $v_{i+1}$, for each $i = 1, 2, \ldots, k-1\}$.

Notice that, by the above definition $x \pi y$ is a subset of V which contains the end points x and y. Since G is a tree, it has only one connected component and therefore any two vertices x and y are connected by a unique $x \pi y$.

Definition 1. Two vertices a and c on an undirected graph G are said to be separated given a subset Z of $V \setminus \{a, c\}$ if each path $\pi$ between a and c intersects Z. Two subsets A and C of V are separated given $Z \subseteq V \setminus (A \cup C)$ if Z separates each $a \in A$ from each $c \in C$. Two subsets A and C of V are connected given a subset Z if they are not separated given Z.


Clearly on a tree a and c are separated given each $x \in a \pi c \setminus \{a, c\}$. On the other hand, since any two vertices a and c are connected by a unique path, a and c cannot be separated given $\emptyset$.
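
The path $x \pi y$ and the separation criterion of Definition 1 are straightforward to compute on a tree. The sketch below assumes the tree is stored as an adjacency dictionary; the function names and the example tree are hypothetical.

```python
from collections import deque

def tree_path(adj, x, y):
    """Vertices on the unique path joining x and y in a tree (adjacency dict)."""
    parent = {x: None}
    queue = deque([x])
    while queue:
        v = queue.popleft()
        if v == y:
            break
        for w in adj[v]:
            if w not in parent:
                parent[w] = v
                queue.append(w)
    path = []
    v = y
    while v is not None:          # walk back from y to x
        path.append(v)
        v = parent[v]
    return path[::-1]

def separated_on_tree(adj, a, c, cond):
    """On a tree, a and c are separated given cond iff the path a..c meets cond."""
    return any(v in cond for v in tree_path(adj, a, c))

# Hypothetical tree with edges a-x, x-c, x-z, z-zp
adj = {"a": ["x"], "c": ["x"], "x": ["a", "c", "z"], "z": ["x", "zp"], "zp": ["z"]}
path_a_zp = tree_path(adj, "a", "zp")
```

Since a tree has exactly one path between any two vertices, checking that single path is enough; on a general undirected graph every path would have to be examined.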

The separation criterion described above associates a set of conditional independence relations with G. This set is described by a collection of triples,

(3)    $\mathfrak{I}(G) = \{(T_1, T_2 | T_3)$, where $T_1 \cup T_2 \cup T_3 \subseteq V$ such that $T_1 \perp\!\!\!\perp T_2 | T_3\}$.

The association of the separation criterion with $\mathfrak{I}(G)$ can be described as follows:

$(T_1, T_2 | T_3) \in \mathfrak{I}(G) \iff T_1$ is separated from $T_2$ given $T_3$ in G.

If $V \sim N(0, \Sigma)$, then $\Sigma$ satisfies all conditional independence relationships in $\mathfrak{I}(G)$. This implies that if $\Lambda = \Sigma^{-1}$, for each $(T_1, T_2 | T_3) \in \mathfrak{I}(G)$, $\Lambda_{T_1 T_2} = 0$.

We now define the formal operation of conditioning for the independence model $\mathfrak{I}(G)$, on subsets of V.


Definition 2. An independence model $\mathfrak{I}(G)$ after conditioning on a subset Z is the set of triples defined as follows:

(4)    $\mathfrak{I}(G)[^{Z} = \{(T_1, T_2 | T_3) \mid (T_1, T_2 | T_3 \cup Z) \in \mathfrak{I}(G);\ (T_1 \cup T_2 \cup T_3) \cap Z = \emptyset\}$.

Thus if $\mathfrak{I}(G)$ contains the independence relations satisfied by a $N(0, \Sigma)$ on G, then $\mathfrak{I}(G)[^{Z}$ constitutes the subset of independencies holding among the variables in $Z^c = V \setminus Z$, after conditioning on Z. Let $G_{Z^c}$ be the subgraph of G with vertex set $Z^c$ and edge set consisting of all edges in E between the vertices in $Z^c$. The following Lemma makes the connection between $\mathfrak{I}(G)[^{Z}$ and $\mathfrak{I}(G_{Z^c})$.


Lemma 1. Suppose G = (V, E) is a tree. Let a, c be two distinct vertices, $Z \subseteq V \setminus \{a, c\}$ and $Z^c = V \setminus Z$. Then

(5)    $\mathfrak{I}(G)[^{Z} = \mathfrak{I}(G_{Z^c})$.


Lemma 1 holds for any UG. It implies that the conditioning on Z does not add or delete any edge in $G_{Z^c}$, so if G is a tree $\mathfrak{I}(G)[^{Z}$ can be represented by a forest. The inverse of the conditional covariance matrix of $Z^c$ given Z is simply $\Lambda_{Z^c Z^c}$.
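
The last statement is the standard block-inverse identity for Gaussian vectors and can be checked numerically. The sketch below assumes NumPy and uses an arbitrary random positive definite matrix; the index sets are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
m = rng.normal(size=(5, 5))
sigma = m @ m.T + 5.0 * np.eye(5)      # a generic positive definite covariance
lam = np.linalg.inv(sigma)             # precision matrix Lambda

z = [3, 4]                             # vertices conditioned on
zc = [0, 1, 2]                         # remaining vertices Z^c

# Conditional covariance of Z^c given Z (Schur complement)
cond_cov = (sigma[np.ix_(zc, zc)]
            - sigma[np.ix_(zc, z)] @ np.linalg.solve(sigma[np.ix_(z, z)],
                                                     sigma[np.ix_(z, zc)]))

# Its inverse should coincide with the Z^c block of Lambda
cond_precision = np.linalg.inv(cond_cov)
lam_block = lam[np.ix_(zc, zc)]
```

In particular, zeros of $\Lambda$ among the vertices in $Z^c$ are unchanged by conditioning on Z, in line with Lemma 1.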

Separation ensures conditional independence, but even if the separation fails the corresponding conditional covariance can still be zero (implying conditional independence for Gaussian random variables) because of the parameter values. However, Theorem 2 is still valid in these cases.

For a fixed conditionate the rules for comparing squared partial correlations on trees follow easily from Theorem 1 and the separation criterion.


Theorem 6. Suppose that, on a Gaussian tree G, the vertices a, c, c' are such that $c \in a \pi c'$. Then for any $Z \subseteq V$, $\rho^2_{ac'|Z} \le \rho^2_{ac|Z}$.

For fixed correlates a and c and two sets $Z_1$ and $Z_2$ of cardinality more than one, $\rho^2_{ac|Z_1}$ and $\rho^2_{ac|Z_2}$ can be compared qualitatively. The following result describes a sufficient condition.

Theorem 7. Let G = (V, E) be a Gaussian tree. Suppose a and c are two vertices on G and $Z_1$ and $Z_2$ are two subsets of V such that $ac \perp\!\!\!\perp Z_2|Z_1$. Then $\rho^2_{ac|Z_1} \le \rho^2_{ac|Z_2}$.

From the separation criterion described above, it follows that the vertices a and c being separated from $Z_2$ given $Z_1$ implies $ac \perp\!\!\!\perp Z_2|Z_1$ and therefore $\rho^2_{ac|Z_1} \le \rho^2_{ac|Z_2}$. The following Corollary gives the corresponding sufficient condition in terms of paths:


Corollary 2. Suppose $Z_1$ and $Z_2$ are two subsets of V, such that for each vertex $z_2 \in Z_2$, both the paths $a \pi z_2$ and $c \pi z_2$ intersect $Z_1$, then $\rho^2_{ac|Z_1} \le \rho^2_{ac|Z_2}$.

Notice that Theorem 7 is more general than Corollary 2: the Theorem covers the cases when the conditional independence holds due to the choices of parameters as well. The result in Theorem 7 is also complete in the following sense.

Theorem 8. Suppose G = (V, E) is a Gaussian tree. Let $Z_1, Z_2 \subseteq V$ such that $ac \not\perp\!\!\!\perp Z_2|Z_1$ and $ac \not\perp\!\!\!\perp Z_1|Z_2$. Further, suppose that $(Z_1 \cup Z_2) \cap a \pi c = \emptyset$. Then there exists $\Sigma_1$ such that $\rho^2_{ac|Z_1} > \rho^2_{ac|Z_2}$ and $\Sigma_2$ such that $\rho^2_{ac|Z_2} > \rho^2_{ac|Z_1}$.

Finally, Theorem 6 and Corollary 2 can be combined into a general rule for comparing squared partial correlations on trees.


Corollary 3. Suppose a, c, c' are three vertices on a Gaussian tree G and Z, Z' are two subsets of the vertex set V. Further, assume that $c \in a \pi c'$ and the vertices a and c' are separated from Z given Z'. Then $\rho^2_{ac'|Z'} \le \rho^2_{ac|Z}$.


4. Application to polytree models and model selection 


A polytree is a DAG such that if we substitute all its directed edges with undirected ones, the resulting graph (i.e. its skeleton) would be a tree. Thus on a polytree two vertices x and y can have at most one path $x \pi y$ connecting them. Here, on a connecting path we disregard the direction of the individual edges.

A vertex y is an ancestor of a vertex x, if either y = x or x can be reached from y by following the arrowheads of a directed path (i.e. the path $y \to v_1 \to v_2 \to \cdots \to v_k \to x$ exists). The collection of all ancestors of x is denoted by an(x). Furthermore, for a set of vertices X we define $an(X) = \bigcup_{x \in X} an(x)$.

Theorem 9. Suppose that on a Gaussian polytree $a \ne c \ne b$, $a \in an(c)$ and $c \in an(b)$. Further let, for some vertex z, $\rho^2_{ac|bz} \ne \rho^2_{ac|b}$. Then

(1) $\rho^2_{ac|bz} > \rho^2_{ac|b}$ iff $a \perp\!\!\!\perp z$ and $c \not\perp\!\!\!\perp z$.

(2) $\rho^2_{ac|bz} < \rho^2_{ac|b}$ iff either $c \perp\!\!\!\perp z$ or $a \not\perp\!\!\!\perp z$.

Figure 7. Examples of polytrees satisfying the conditions of Theorem 9. In each, $a \in an(c)$ and $c \in an(b)$. In 7(a) $z_{11}$ and $z_{12}$ satisfy condition 1. and $\rho^2_{ac|b} < \rho^2_{ac|bz_{12}} < \rho^2_{ac|bz_{11}}$ (from Theorem 4 (ii)). In 7(b) $z_{21}$ and $z_{22}$ satisfy condition 2. So $\rho^2_{ac|bz_{21}} < \rho^2_{ac|bz_{22}} < \rho^2_{ac|b}$ (see Theorem 2 and Figure 5(c)). Each $z_{3k}$, k = 1, ..., 4, in 7(c) satisfies condition 2., i.e. $\rho^2_{ac|bz_{3k}} < \rho^2_{ac|b}$. Note that b cannot be in an(z), otherwise $ac \perp\!\!\!\perp z|b$ and $\rho^2_{ac|b} = \rho^2_{ac|bz}$.

Figure 8. An illustration of the results in Theorem 9 on the river network of the Avon river, Hampshire, England (obtained from Jarvie et al. (2005)).


The condition $\rho^2_{ac|bz} \ne \rho^2_{ac|b}$ is required in Theorem 9. This implies $ac \not\perp\!\!\!\perp z|b$. So $b \notin an(z)$. It can further be shown (see the proof) that the polytree structure implies $ac \perp\!\!\!\perp z$ iff $c \perp\!\!\!\perp z$. Thus the right hand side of Condition 2. above equivalently means that either both a and c are independent of z or none of them are independent of z. Examples of graphs satisfying the conditions 1. and 2. can be found in Figure 7.
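
Both conditions of Theorem 9 can be illustrated on two hypothetical four-vertex polytrees, assuming NumPy and unit parameters. In the first, z is a parent of c (so $a \perp\!\!\!\perp z$ and $c \not\perp\!\!\!\perp z$); in the second, z is a parent of b (so $c \perp\!\!\!\perp z$). The graphs and helper names are illustrative and are not the graphs of Figure 7.

```python
import numpy as np

def sem_covariance(n, edges):
    """Covariance of a linear Gaussian DAG with unit noise variances.
    edges maps (parent, child) -> regression coefficient."""
    coef = np.zeros((n, n))
    for (parent, child), weight in edges.items():
        coef[child, parent] = weight
    total = np.linalg.inv(np.eye(n) - coef)
    return total @ total.T

def sq_partial_corr(sigma, a, c, cond=()):
    """Squared partial correlation of a and c given cond, as in (1)."""
    cond = list(cond)
    idx = [a, c] + cond
    sub = sigma[np.ix_(idx, idx)]
    if cond:
        s12 = sub[:2, 2:]
        s = sub[:2, :2] - s12 @ np.linalg.solve(sub[2:, 2:], s12.T)
    else:
        s = sub
    return s[0, 1] ** 2 / (s[0, 0] * s[1, 1])

A, C, B, Z = range(4)

# Condition 1: a -> c -> b with z -> c
sigma1 = sem_covariance(4, {(A, C): 1.0, (C, B): 1.0, (Z, C): 1.0})
cond1_b = sq_partial_corr(sigma1, A, C, [B])        # 1/9
cond1_bz = sq_partial_corr(sigma1, A, C, [B, Z])    # 1/4

# Condition 2: a -> c -> b with z -> b
sigma2 = sem_covariance(4, {(A, C): 1.0, (C, B): 1.0, (Z, B): 1.0})
cond2_b = sq_partial_corr(sigma2, A, C, [B])        # 1/3
cond2_bz = sq_partial_corr(sigma2, A, C, [B, Z])    # 1/4
```

Adding z to the conditionate raises the squared partial correlation in the first graph and lowers it in the second, so the direction of the change reveals where z attaches.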

Theorem 9 has applications in model selection. An example occurs in the mapping of river flow networks. Figure 8 (Jarvie et al., 2005) presents a schematic diagram of the network of the Avon basin in Hampshire, England. Suppose that it is known that none of the rivers involved have a distributary. Clearly the network, with the direction of the water flow, forms a polytree. Measurements can be taken at points a (Netheravon), b (Christchurch), c (Amesbury), d (Downstream of Salisbury STW), e (Longford) and z (Chitterne). However, because of practical considerations we suppose that the measurements are taken when the water level at Christchurch (b) touches certain levels. Let us assume $\rho^2_{ax|b} \ne \rho^2_{ax|bz}$ for x = c, d, e. We want to know where the stream from z, i.e. Chitterne, meets the river Avon.

It is clear that since the observations are all conditional on the water level at b, in the data neither $z \perp\!\!\!\perp a$ nor $z \perp\!\!\!\perp c$. However, from Theorem 2 (see also Figure 5(c)) and Theorem 4 it follows that $\rho^2_{ac|bz} < \rho^2_{ac|b}$, $\rho^2_{ad|bz} > \rho^2_{ad|b}$ and $\rho^2_{ae|bz} > \rho^2_{ae|b}$. From Condition 2. of Theorem 9 it follows that either both a and c are independent of z or none of them are. On the other hand, Condition 1. implies that $a \perp\!\!\!\perp z$ but d and e are not independent of z. If none of a and c are independent of z, the point z must be on a distributary stream or on a tributary which meets Avon north of a (Netheravon). However, by assumption there is no distributary stream. Furthermore, if the tributary from z meets Avon somewhere north of a, by Theorem 2 both $\rho^2_{ad|bz} < \rho^2_{ad|b}$ and $\rho^2_{ae|bz} < \rho^2_{ae|b}$ must hold. This is a contradiction. Thus $ac \perp\!\!\!\perp z$ must hold. So from Theorem 9 we see that the stream from Chitterne, i.e. z, meets Avon somewhere between Amesbury, i.e. c, and Downstream of Salisbury STW, i.e. d.

Figure 9. Plot of $\rho^2_{ac|z}$ and $\rho^2_{ax|z}$ for the graph on the left with $\beta_{zc}$ and $\tau^2$. [Plot panels: squared correlation as a parameter varies.]

5. Necessity of the conditional independence relationships 

In the above sections we postulated some sufficient conditional independence relationships under which some squared conditional correlations can be qualitatively compared. It is not known if these relationships are necessary as well. It is possible that qualitative comparison would hold under different sets of conditions. However, the conditions in any set of relationships cannot be reduced. In this section we show this fact using various counterexamples.

In each counterexample, unless otherwise stated, all parameters, i.e. the regression coefficients and the node specific conditional variances, are set to 1.

5.1. Comparison with a fixed conditionate. We consider the graph in Figure 9. Note that c is a collider on the path $a \pi x$ and z is a child of c. Thus, from the laws of d-separation x is not d-separated from a given c and z. Under our choice of parametrisation clearly $a \not\perp\!\!\!\perp x|cz$. In the plots to the right of Figure 9 we change respectively $\beta_{cz}$ and $\tau^2$ and keep other parameters fixed. It is clear from the plots that $\rho^2_{ac|z}$ and $\rho^2_{ax|z}$ cannot be qualitatively compared. This shows the condition of Theorem 1 cannot be relaxed.
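
The reversal described above can be reproduced with a short sketch, assuming NumPy and the DAG a → c ← x, c → z with unit coefficients on the edges into c and unit noise variances. Only the coefficient on the edge c → z is varied. The two values 2.0 and 0.5 are chosen here for illustration and are not the values used in Figure 9; the helper names are also illustrative.

```python
import numpy as np

def sem_covariance(n, edges, noise_var):
    """Covariance of a linear Gaussian DAG.
    edges maps (parent, child) -> coefficient; noise_var lists node noise variances."""
    coef = np.zeros((n, n))
    for (parent, child), weight in edges.items():
        coef[child, parent] = weight
    total = np.linalg.inv(np.eye(n) - coef)
    return total @ np.diag(noise_var) @ total.T

def sq_partial_corr(sigma, a, c, cond=()):
    """Squared partial correlation of a and c given cond, as in (1)."""
    cond = list(cond)
    idx = [a, c] + cond
    sub = sigma[np.ix_(idx, idx)]
    if cond:
        s12 = sub[:2, 2:]
        s = sub[:2, :2] - s12 @ np.linalg.solve(sub[2:, 2:], s12.T)
    else:
        s = sub
    return s[0, 1] ** 2 / (s[0, 0] * s[1, 1])

A, X, C, Z = range(4)

def compare(beta_cz):
    """Return (rho^2_{ac|z}, rho^2_{ax|z}) for a given coefficient on c -> z."""
    edges = {(A, C): 1.0, (X, C): 1.0, (C, Z): beta_cz}
    sigma = sem_covariance(4, edges, [1.0, 1.0, 1.0, 1.0])
    return sq_partial_corr(sigma, A, C, [Z]), sq_partial_corr(sigma, A, X, [Z])

ac_strong, ax_strong = compare(2.0)   # 1/27 and 16/81
ac_weak, ax_weak = compare(0.5)       # 2/9 and 1/36
```

With the larger coefficient $\rho^2_{ax|z}$ exceeds $\rho^2_{ac|z}$, and with the smaller one the order is reversed, so no parameter-free ordering exists once c is a collider between a and x.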
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5.2. Comparison with fixed correlates. We only consider the necessity of the conditions of Theorems [5] 
and [3] here. The examples for Theorem [4] are similar. 

The graphs and the plots used in the counterexamples are described as follows. In Figures [10] and [11]
the graphs with solid edges satisfy the assumptions of Theorems [5] and [3] respectively. We consider the
graph with the dashed edges. However, for all but one of these edges the corresponding regression
coefficients are set to zero. Each dashed edge implies the violation of one conditional independence relationship.

The plots are interpreted as follows. The title of each plot describes which regression coefficients are set
to zero. The remaining regression coefficient is varied and the values of the conditional and unconditional
regression coefficients are calculated.


5.2.1. Figure [10]. The graph with only the solid edges satisfies the conditions of Theorem [5]. If an edge between
a and c is added, i.e. if $\beta_{ac} \neq 0$ but $\beta_{z_1 c} = \beta_{z_2 a} = 0$, $a \perp\!\!\!\perp c \mid x$ no longer holds. Figure [10(b)] shows that
none of $\rho^2_{ac}$, $\rho^2_{ac|z_1}$ and $\rho^2_{ac|z_2}$ can be qualitatively compared. Note that when $\beta_{ac} = 0$ the graph satisfies the
condition of Theorem [2]. So we get $\rho^2_{ac|z_1} < \rho^2_{ac|z_2} < \rho^2_{ac}$ as predicted.

If we set $\beta_{ac} = \beta_{z_2 a} = 0$ and allow $\beta_{z_1 c}$ to vary, then for non-zero values of $\beta_{z_1 c}$ the condition $ac \perp\!\!\!\perp z_1 \mid x$
is violated. So in Figure [10(c)] we see that the concerned squared partial correlation coefficients are not
comparable.

When $\beta_{ac} = \beta_{z_1 c} = 0$ and $\beta_{z_2 a}$ varies, the condition $z_2 \perp\!\!\!\perp acx \mid z_1$ is potentially violated. The condition
$z_2 \perp\!\!\!\perp x \mid z_1$ is not required for Theorem [2], but for most graphical Markov models $z_2 \perp\!\!\!\perp ac \mid z_1$ would imply this
condition. Figure [10(d)] shows that the squared correlations cannot be qualitatively compared in this case
either.

The above examples show that none of the conditions of Theorem [2] can be relaxed further.


5.2.2. Figure [11]. In this figure the graph with solid edges satisfies the conditions of Theorem [3]. If $\beta_{ca} \neq 0$
then the assumption that $a \perp\!\!\!\perp c$ is violated. As is evident from the plot in Figure [11(b)], $\rho^2_{ac|z_1}$, $\rho^2_{ac|z_2}$
and $\rho^2_{ac|x}$ cannot be qualitatively compared.

If $\beta_{z_1 c} \neq 0$, $c \not\perp\!\!\!\perp z_1 \mid x$ and from Figure [11(c)] it is seen that the squared correlations cannot be qualitatively
compared either.

Finally, when $\beta_{z_2 c} \neq 0$, $z_2$ becomes conditionally dependent on c given $z_1$. From Figure [11(d)] we once
again conclude that the squared correlations under consideration cannot be qualitatively compared.

The above examples prove that no conditions in Theorem [3] can be relaxed.


6. Discussion


Qualitative comparison may be possible under other sets of conditional independence relations. The
requirement of a single component x cannot be relaxed. The results in Section [2] are sufficient for postulating
path-based rules for comparison on polytree models as well. Since the edges on a polytree are directed, these
rules are more involved than those for trees (Chaudhuri and Richardson, 2003).

Comparison of mutual information with a fixed conditionate holds for any distribution. In fact, the
results with fixed correlates are based on the positive-definiteness of the covariance matrix and extend to
non-Gaussian distributions as well. However, inequalities for squared partial correlation would not translate
to mutual information for such random variables. These results may be applicable to causal model selection
among non-Gaussian variables (e.g. Shimizu et al (2006)).

It can be shown that, although the comparisons with a fixed conditionate do not hold, absolute values
of partial regression coefficients can be qualitatively compared for fixed correlates under the same conditions
(Chaudhuri and Tan, 2010).

Rules for signed comparisons of partial correlation and regression coefficients can be developed from these
results. Such results might be useful in identifying hidden variables in factor models (Bekker and de Leeuw,
1987; Drton et al, 2007; Spirtes et al, 2000; Xu and Pearl, 1989) and in recovering the population covariance
matrix for one-factor models in the presence of selection bias (Kuroki and Cai, 2006).


Appendix A. Proofs 

Notation: For two real numbers $a$ and $b$, $a \propto_+ b$ means that there exists $M > 0$ such that $a = M \cdot b$.

[Figure 10: panels (a)-(d); images not recovered.]

Figure 10. The directed acyclic graph described in Example 1. Clearly there is no ordering
between $\rho^2_{ac}$, $\rho^2_{ac|z_1}$ and $\rho^2_{ac|z_2}$.


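Proposition 1. Suppose $U$, $V$, $W$ are univariate components of a Gaussian random vector with mean $\mu$
and positive definite covariance $\Sigma$. Assume that $U \perp\!\!\!\perp V \mid W$. Then $\sigma_{UV} = \sigma_{UW}\sigma_{WV}/\sigma_{WW}$ and $\sigma_{UU} =
\sigma_{UW}\sigma_{WU}/\sigma_{WW} + E[\mathrm{Var}(U \mid W)]$.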

Proof. Trivial. □ 
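
As a quick sanity check of Proposition 1, the sketch below builds a three-variable example in which U and V are conditionally independent given W by construction and verifies both identities with exact rational arithmetic. The particular model (W with variance 2, U and V each regressed on W with independent noise) and all numerical values are assumptions chosen purely for illustration.

```python
from fractions import Fraction as F

# Assumed toy model: W ~ N(0, 2), U = b_u * W + e_u, V = b_v * W + e_v,
# with e_u, e_v independent of W and of each other, so U and V are
# conditionally independent given W.
var_w, b_u, b_v = F(2), F(1, 2), F(-3, 2)
var_eu, var_ev = F(1), F(3)

s_ww = var_w
s_uw = b_u * var_w                 # Cov(U, W)
s_wv = b_v * var_w                 # Cov(W, V)
s_uv = b_u * b_v * var_w           # Cov(U, V)
s_uu = b_u ** 2 * var_w + var_eu   # Var(U)

# Conditional variance of U given W via the usual Gaussian formula.
cond_var_u = s_uu - s_uw ** 2 / s_ww

first_identity_holds = (s_uv == s_uw * s_wv / s_ww)
second_identity_holds = (s_uu == s_uw * s_uw / s_ww + cond_var_u)
```

Here `cond_var_u` recovers the noise variance of U, as expected for a Gaussian regression of U on W.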

Suppose $K$ and $K'$ are constants and for some $a, c, d \in V$ and $B \subseteq V \setminus \{a, c, d\}$ (where $B$ may be empty)
we denote $M_1 = \sigma_{cd|B}\{\sigma_{ad|B}\sigma_{cc|B} - \sigma_{ac|B}\sigma_{cd|B}\}$,
$M_2 = \sigma_{ad|B}\{\sigma_{cd|B}\sigma_{aa|B} - \sigma_{ac|B}\sigma_{ad|B}\}$, $M_3(\alpha) = [(\alpha - K')\sigma_{ac|B}\sigma_{dd|B} - K \cdot \sigma_{ad|B}\sigma_{cd|B}]$ and

(6) $$L(\alpha) = \frac{\{(\alpha - K')\rho_{ac|B} - K\rho_{ad|B}\rho_{cd|B}\}^2}{\{(\alpha - K') - K\rho^2_{ad|B}\}\{(\alpha - K') - K\rho^2_{cd|B}\}}.$$

[Figure 11: panels (a)-(d), with plot titles of the form $\beta_{z_1 c} = 0$, $\beta_{z_2 c} = 0$; images not recovered.]

Figure 11. The directed acyclic graph described in Example 1. Clearly there is no ordering
between the squared partial correlations under comparison.

Lemma 2. Suppose $K > 0$ and for some $K'$ and $\alpha$, $(\alpha - K') - K\rho^2_{ad|B} > 0$ and $(\alpha - K') - K\rho^2_{cd|B} > 0$.
Then if $M_1 \cdot M_2 \ge 0$:

(1) $dL(\alpha)/d\alpha = 0$ if both $M_1 \cdot M_3(\alpha)$ and $M_2 \cdot M_3(\alpha)$ are 0.

(2) $dL(\alpha)/d\alpha$ has the same sign as either $M_1 \cdot M_3(\alpha)$ or $M_2 \cdot M_3(\alpha)$, whichever is non-zero.

Proof. Since the denominator of (6) is positive, the sign of $dL(\alpha)/d\alpha$ is the sign of the numerator of
$dL(\alpha)/d\alpha$. From the quotient rule of differentiation and some algebraic manipulation we get:

(7) $$\frac{dL(\alpha)}{d\alpha} \propto_+ K\,[(\alpha - K')\rho_{ac|B} - K\rho_{ad|B}\rho_{cd|B}] \times \Big\{ [(\alpha - K') - K\rho^2_{ad|B}]\,\rho_{cd|B}\,[\rho_{ad|B} - \rho_{ac|B}\rho_{cd|B}] + [(\alpha - K') - K\rho^2_{cd|B}]\,\rho_{ad|B}\,[\rho_{cd|B} - \rho_{ac|B}\rho_{ad|B}] \Big\}.$$

Note that $\rho_{cd|B}[\rho_{ad|B} - \rho_{ac|B}\rho_{cd|B}] \propto_+ M_1$, $\rho_{ad|B}[\rho_{cd|B} - \rho_{ac|B}\rho_{ad|B}] \propto_+ M_2$ and $[(\alpha - K')\rho_{ac|B} - K\rho_{ad|B}\rho_{cd|B}] \propto_+ M_3(\alpha)$.
By substituting these expressions in (7) and using the positivity of $K$, $[(\alpha - K') - K\rho^2_{ad|B}]$ and
$[(\alpha - K') - K\rho^2_{cd|B}]$, the result follows. □


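Proof of Theorem [1]. From the assumption, $\mathrm{Inf}(a \perp\!\!\!\perp c' \mid cZ) = 0$. The rest follows from the identity
$\mathrm{Inf}(a \perp\!\!\!\perp c' \mid cZ) + \mathrm{Inf}(a \perp\!\!\!\perp c \mid Z) = \mathrm{Inf}(a \perp\!\!\!\perp c \mid c'Z) + \mathrm{Inf}(a \perp\!\!\!\perp c' \mid Z)$.$^1$ □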
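
The identity used in this proof can be illustrated numerically. The sketch below assumes that Inf denotes conditional mutual information, which for jointly Gaussian variables equals $-\frac{1}{2}\log(1 - \rho^2)$ with $\rho$ the relevant partial correlation, and takes $Z$ to be empty. The variable `d` stands in for $c'$, and the correlation values are arbitrary choices made so that $a \perp\!\!\!\perp c' \mid c$ holds.

```python
import math


def partial_corr(r_uv, r_uw, r_vw):
    """Partial correlation of u and v given a single variable w."""
    return (r_uv - r_uw * r_vw) / math.sqrt((1 - r_uw ** 2) * (1 - r_vw ** 2))


def info(rho):
    """Gaussian (conditional) mutual information for a (partial) correlation rho."""
    return -0.5 * math.log(1 - rho ** 2)


# Assumed correlations; d plays the role of c'. Setting r_ad = r_ac * r_cd
# makes a and d conditionally independent given c.
r_ac, r_cd = 0.6, 0.5
r_ad = r_ac * r_cd

# Inf(a, c' | c) + Inf(a, c)  versus  Inf(a, c | c') + Inf(a, c')
lhs = info(partial_corr(r_ad, r_ac, r_cd)) + info(r_ac)
rhs = info(partial_corr(r_ac, r_ad, r_cd)) + info(r_ad)
```

Because the first term of `lhs` vanishes and every term is non-negative, the identity forces `info(r_ad) <= info(r_ac)` in this example, which is the comparison the theorem asserts.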

Note that, from Lnenicka and Matus (2007), assumptions on conditional independence and the conditional
correlations do not change if we replace $\Sigma$ by $J\Sigma J$, where $J$ is the diagonal matrix with entries $1/\sqrt{\sigma_{vv}}$, $v \in V$.
Thus, unless otherwise stated, w.l.o.g. we can assume that the diagonal elements of $\Sigma$ are all equal to 1 and
all the off-diagonals are in $(-1, 1)$. That is, $\Sigma$ is the correlation matrix of $V$, but with an abuse of notation
in what follows below, we still denote the correlation of a and c by $\sigma_{ac}$.
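
The invariance under rescaling can be seen on a small example. This is a minimal sketch with an arbitrary, assumed 3 × 3 covariance matrix: it forms $J\Sigma J$ and checks that a partial correlation computed from it agrees with the one computed from $\Sigma$.

```python
import math


def partial_corr_given_one(S, i, j, k):
    """Partial correlation of components i and j given component k of covariance S."""
    s_ij = S[i][j] - S[i][k] * S[j][k] / S[k][k]
    s_ii = S[i][i] - S[i][k] ** 2 / S[k][k]
    s_jj = S[j][j] - S[j][k] ** 2 / S[k][k]
    return s_ij / math.sqrt(s_ii * s_jj)


# Assumed positive definite covariance matrix (values chosen for illustration).
S = [[4.0, 2.4, 0.8],
     [2.4, 9.0, 1.5],
     [0.8, 1.5, 1.0]]

# R = J S J with J = diag(1 / sqrt(S[v][v])), i.e. the correlation matrix.
n = len(S)
R = [[S[i][j] / math.sqrt(S[i][i] * S[j][j]) for j in range(n)] for i in range(n)]

rho_from_cov = partial_corr_given_one(S, 0, 1, 2)
rho_from_corr = partial_corr_given_one(R, 0, 1, 2)
```

The two values coincide up to floating-point error, so working with the correlation matrix loses nothing for the comparisons studied here.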

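Proof of Theorem [2]. Note that by assumption $\sigma_{ac} = \sigma_{ax}\sigma_{cx}$, $\sigma_{az} = \sigma_{ax}\sigma_{xz}$, $\sigma_{cz} = \sigma_{cx}\sigma_{xz}$, $\sigma_{az'} = \sigma_{ax}\sigma_{xz}\sigma_{zz'}$
and $\sigma_{cz'} = \sigma_{cx}\sigma_{xz}\sigma_{zz'}$.

Part 1. $\rho^2_{ac|z} = \sigma^2_{ac}(1 - \sigma^2_{xz})^2/[(1 - \sigma^2_{ax}\sigma^2_{xz})(1 - \sigma^2_{cx}\sigma^2_{xz})]$. Now since $\sigma^2_{ax}\sigma^2_{xz} \le \sigma^2_{xz}$ and $\sigma^2_{cx}\sigma^2_{xz} \le \sigma^2_{xz}$,
$\rho^2_{ac|z} \le \rho^2_{ac}$.

Part 2. Assume that $x \neq z'$ and consider three non-trivial cases: $x = a$, $x = c$ and $x \notin \{a, c\}$. Initially
assume that $\sigma_{zz'} \neq 0$. Since $ac \perp\!\!\!\perp z' \mid z$, using Proposition [1] and the positive definiteness of the covariance
matrix together with $\tau^2 = (1 - \sigma^2_{zz'}) > 0$ and by denoting $\alpha = 1 + (\tau^2/\sigma^2_{zz'}) > 1$, with $B = \emptyset$, $K' = 0$,
$K = 1$ it follows that $\rho^2_{ac|z'} = L(\alpha)$ for $\alpha > 1$ and $\rho^2_{ac|z} = L(1)$. Thus in Lemma [2], using the Cauchy-Schwarz
inequality and $\alpha > 1$, it follows that for $x = a$, $M_1 \propto_+ \sigma_{cx}$, $M_2 = 0$ and $M_3(\alpha) \propto_+ \sigma_{cx}$; for $x = c$, $M_1 = 0$,
$M_2 \propto_+ \sigma_{ax}$ and $M_3(\alpha) \propto_+ \sigma_{ax}$; and for $x \notin \{a, c\}$, $M_1 \propto_+ \sigma_{ax}\sigma_{cx}$, $M_2 \propto_+ \sigma_{ax}\sigma_{cx}$ and $M_3(\alpha) \propto_+ \sigma_{cx}\sigma_{ax}$.
Thus for all cases $dL/d\alpha \ge 0$ and the result follows. If $\sigma_{zz'} = 0$, $z \perp\!\!\!\perp z'$ and $z' \perp\!\!\!\perp acz$. Thus $\rho^2_{ac|z'} = \rho^2_{ac}$.
The rest follows from part 1.

For the second inequality notice that, by our assumption, $\sigma_{az'} = \sigma_{az}\sigma_{zz'} = \sigma_{ax}\sigma_{xz}\sigma_{zz'}$. Since we don't
assume $x \perp\!\!\!\perp z' \mid z$, $\sigma_{xz}\sigma_{zz'}$ is not necessarily equal to $\sigma_{xz'}$. However,
$\rho^2_{ac|z'} = \sigma^2_{ac}(1 - \sigma^2_{xz}\sigma^2_{zz'})^2/[(1 - \sigma^2_{ax}\sigma^2_{xz}\sigma^2_{zz'})(1 - \sigma^2_{cx}\sigma^2_{xz}\sigma^2_{zz'})] \le \rho^2_{ac}$ in the same way as in part 1. □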

Proof of Theorem [3]. By assumption $zB \perp\!\!\!\perp ac \mid x$ and $a \perp\!\!\!\perp c$.

Part 1. It is enough to show that $\sigma^2_{ac|Bz} \ge \sigma^2_{ac|B}$. Using the above relations in Proposition [1] and by
denoting $Q_1 = \Sigma_{xB}\Sigma^{-1}_{BB}\Sigma_{Bx}$ and $Q_2 = (\Sigma_{xB}, \sigma_{xz})\,\Sigma^{-1}_{(Bz)(Bz)}\,(\Sigma_{xB}, \sigma_{xz})^T$ one gets $\sigma_{ac|Bz} = -\sigma_{ax}\sigma_{cx}Q_2$ and
$\sigma_{ac|B} = -\sigma_{ax}\sigma_{cx}Q_1$. Now the proof follows by noting that $\sigma_{aa} - \sigma^2_{ax}Q_1 = \sigma_{aa|B} \ge \sigma_{aa|Bz} = \sigma_{aa} - \sigma^2_{ax}Q_2$
implies $Q_2 \ge Q_1$.

Part 2. We initially assume that $\sigma_{zz'} \neq 0$. By defining $\tau^2 = (1 - \sigma^2_{zz'}) > 0$, $\alpha = (1 + (\tau^2/\sigma^2_{zz'}))$,
$K' = \Sigma_{zB}\Sigma^{-1}_{BB}\Sigma_{Bz} \ge 0$, $K = (1 - K') > 0$ and from the assumption that $z' \perp\!\!\!\perp acB \mid z$ it follows that
$\rho^2_{ac|Bz'} = L(\alpha)$ with $\alpha > 1$ and $\rho^2_{ac|Bz} = L(1)$. Further, using $ac \perp\!\!\!\perp zB \mid x$ one can show that $M_1 \propto_+
\sigma_{cx}\sigma_{ax}\sigma^2_{xz|B}$, $M_2 \propto_+ \sigma_{cx}\sigma_{ax}\sigma^2_{xz|B}$ and $M_3(\alpha) \propto_+ -\sigma_{cx}\sigma_{ax}$. Thus from Lemma [2] it follows that $dL/d\alpha \le 0$.
If $\sigma_{zz'} = 0$, as before $z \perp\!\!\!\perp z'$ and $z' \perp\!\!\!\perp acB$. Thus $\rho^2_{ac|Bz'} = \rho^2_{ac|B}$. The result follows from part 1.

For the first inequality, notice that $\sigma_{ac|Bz'} = -\sigma_{ax}\sigma_{cx}Q_3$ and $\sigma_{aa|Bz'} = 1 - \sigma^2_{ax}Q_3$, where $Q_3 =
(\Sigma_{xB}, \sigma_{xz}\sigma_{zz'})\,\Sigma^{-1}_{(Bz')(Bz')}\,(\Sigma_{xB}, \sigma_{xz}\sigma_{zz'})^T$. This implies $\sigma^2_{ac|Bz'} \ge \sigma^2_{ac|B}$ just like part 1 above. □

$^1$The author would like to thank the referee for drawing his attention to this equality, which improved the original proof
immensely.

Proof of Theorem [4]. W.l.o.g. it is enough to assume that $x \notin B$. Furthermore, note that $\sigma_{aa|B} \ge \sigma_{aa|Bz}$ and
$\sigma_{cc|B} \ge \sigma_{cc|Bz}$; thus for part 1 it is enough to show that under the assumptions $\sigma_{ac|Bz} = m \cdot \sigma_{ac|B}$ for some
$m \ge 1$.


Part 1. Assume that, a _LL z and let (ii) hold, ie. cB _LL az\x. Using Proposition |T] it follows that 


a ac\Bz — Pac\B + 


Pac\B T 


{'^‘aB^‘ BB ^Bz){Pcz ~ ^cB^ BB ^Bz) 
Pzz\B 

^xzQlfacx&ax PcxPaxQ l) 

Pzz\B 


— PacIS + 


PaxP X zQl(.Pcx PcxQl) 


Pzz\B 


®ac\B 


(i + a lxQi a , 


zz\BJ ' 


Thus Plc\B ^ Plc\Bz- Under (i) if c _LL az, a ac = a zc = 0, a ac \ B = -E aB E BB E Bc and a cz \ B = -E cB E BB E B2 . 
Now if (i)(a) ie. az _LL B\x holds: 


( 8 ) 


&ac\Bz ®ac\B 


®ac\B 


(E oB E B ^EB i )(E cB E B ^E Bz ) 

a zz\B 

{PaxPxzQl) {^-‘cB^BB^BxPxz') 


a ac\B + a zxQ^ a zz\B) ’ 


Pzz\B 

Under ie. az _LL B\cx notice that from Proposition [1] 

U aB ^a(xc)^ (xc)(xc)E(xc)i? \Pax: ^ xc ^ xc ^^^xc)B Pax [1; ^\^^xc){xc)^(xc)B PaxQcxB- 

Here Q cxB = [ 1 , 0]E^. 1 c)(a . c) E( sc)B . Similarly it can be shown that, T, zB = PzxQcxB andcr ac | B = -a oa .Q ca;B E BB E Bc . 
Now by substitution in © above we get: 


&ac\Bz &ac\B 
®ac\B 
= Pac\B + 


Pgx{QcxB'B BB Qctb)(UcbU BB^cxb) 0 zx 


a zz\B 

(p zxQcxB^ BB&cxBP zx)(JP‘cb'£‘ BBQ^BPax) 


Pzz\B 


(U zB E BB E B2 )(7 ac | B 

P zz\B 


p ac \B jl + (E, B E B ^E Ba )o- z 4 B } . 


The proofs for (i)(c) and (i)(d) are similar. 

If (i)(e) ie. ac _LL B\x holds, <r ac \ B = —PaxPcxQ\ and using Proposition Q] we get, 


&ac\Bz &ac\B 


(—U aB E BB E B 2 )(—E cB E bb E B z ) 


— Pac\B PaxPcxi^xB^BB^Bz) P Z z\B 


P zz\B 

= a ac\B + (Ua: B E BB E B2 ) /(Ql cr 22 | B )| ■ 

Undei condition (?')(/) notice that, S aB ^a(x 2 )E^ ;r 2 ^, z ,, 2 ^E ( X z)b — ty^ l (xz)(xz)^ I (.xz)B — PaxQxzB~ 

Similarly, E cB = <t cx Q X zB- Now from Q it follows that: 


&ac\Bz ®ac\B 


&ax&cx(QxzB^ BB^Bz) 
&zz\B 


Clearly if at least one of cr ax ,(j cx > QxzB is zero, the results is trivial. Now suppose none of them equal zero. 
Then QxzB^bbQxzB > Further cr ac \ B = -<J ax P C x{QxzB'£‘ B 1 B Q? xzB ), which yields 


&ac\Bz ®ac\B ^ 1 ~h 


(QxzB^B^^Bz) 2 

{QxzB^b b Qxzb)Pzz\B 


Part 2. Suppose a z , z > 0. Let t 2 , = (l — p z , z ) > 0, K’ = E 2 B E BB E B2 , K = (1 — K’) > 0 and 
« = 1 /pIz = (! + T z ,/a z , z ) > 1. Then from acB _LL z'\z, a _LL zz' it follows that for both cases p\ 

L (a) with a > 1 and p 2 ac , B , = L(l). Now we consider the four cases in the statement. By denoting 


,2 

ac\Bz' 
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bb^Bx-) Qi 

ix — ^aB^BB^Bx 

and QaxB — [ 1 , OjE^^ 

)^(xa)B it follows that: 


&axQcx 

if (i), (a) 

&axQcx 

if (i), (a) 


Oax QcxB^BB^Bc 

if (i), ( 6 ) 

&ax QcxB^BB^Bc 

if (i), ( b ) 


&cxQax 

if (i), (c) 

&cxQax 

if (i), (c) 

Mi oc + M 2 oc + < 

Ocx QaxB^BB^Ba 

if ( i), (d) ,M 3 (a ) oc + < 

&cxQaxB^ BB^Ba 

if ( i ), (d) 


&CLX&CX 

if (i), (e) 

&CLX&CX 

if (i), (e) 


&CLX&CX 

if (i), (/) 

&CLX&CX 

if (*)> (/) 


< &CLX&CX 

if (ii) 

< &ax&cx 

if (ii) 


Thus from Lemma [2] in all cases $dL/d\alpha < 0$, which completes the proof.

If $\sigma_{zz'} = 0$, then for all cases $\rho^2_{ac|Bz'} = \rho^2_{ac|B}$ and the result follows from Part 1 as before.

□

Proof of Corollary [1]. If $B = \emptyset$, under (i) from the assumed independence of a, c and z, we get $\sigma_{ac} =
\sigma_{az} = \sigma_{cz} = 0$. The result follows from this. Under (ii), $\sigma_{cz} \neq 0$ and from Theorem [4] the result follows. □
Proof of Theorem [5j In this proof we take £ to be the covariance matrix and not the correlation matrix 
as above. Using condition B _LL acz\x, denoting a xx Q 4 = T = <j zz / ( a zz - v 2 xz Qa) (T > 0) 

and from Proposition |T] and some simplification we get 


plc\Bz __ (<JaafTxxQ 4 :T - al x Q 4 T) (<J CC <?xxQ 4 T - <j 2 cx Q 4 T) 

P 2 ac\ x {°aa ~ (tI x Q 4 T) (t7 cc - CT 2 cx Q 4 T) 

Thus p 2 c \ Bz > P 2 ac \ x iff ctxxQaT > 1 iff (cr xx + cr xz /a zz ) Q 4 > 1. The equivalent expression follows as: 


-£ xbZ^Zbx > O 


T, zB 'E R 1 R Y. Bz > 


7 xx\B 


J BB 




7 xx\B 


Tzz|.B 

O’zz 


□ 

Proof of Lemma [1]. We need to show that if $T_1$, $T_2$ and $T_3$ are disjoint subsets of $Z^c$, then $T_1$ is connected
to $T_2$ given $T_3$ in $G_{Z^c}$ iff $T_1$ is connected to $T_2$ given $T_3 \cup Z$ in $G$.

($\Rightarrow$) Suppose $T_1$ is connected to $T_2$ given $T_3$ in $G_{Z^c}$. So there are $t_1 \in T_1$ and $t_2 \in T_2$ and the path ${}_{t_1}\pi_{t_2}$
such that ${}_{t_1}\pi_{t_2} \cap T_3 = \emptyset$. Clearly ${}_{t_1}\pi_{t_2} \subseteq Z^c$ and ${}_{t_1}\pi_{t_2} \cap Z = \emptyset$. So ${}_{t_1}\pi_{t_2} \cap \{T_3 \cup Z\} = \emptyset$. This shows $T_1$ is
connected to $T_2$ given $T_3 \cup Z$ in $G$.

($\Leftarrow$) Suppose $T_1$ is connected to $T_2$ given $T_3 \cup Z$ in $G$. So there are $t_1 \in T_1$ and $t_2 \in T_2$ and the path ${}_{t_1}\pi_{t_2}$
such that ${}_{t_1}\pi_{t_2} \cap \{T_3 \cup Z\} = \emptyset$. So ${}_{t_1}\pi_{t_2} \cap Z = \emptyset$ and ${}_{t_1}\pi_{t_2} \subseteq Z^c$. Clearly in $G_{Z^c}$, ${}_{t_1}\pi_{t_2} \cap T_3 = \emptyset$. This shows
$T_1$ is connected to $T_2$ given $T_3$ in $G_{Z^c}$. □

Proof of Theorem [6]. From the structure of $G$ and since $c \in {}_a\pi_{c'}$, it easily follows that $c'$ is separated
from $a$ given $c$ and $Z$. The result follows from Theorem [1]. □

Proof of Theorem [7]. For notational convenience we express the squared partial correlations as functions
of the covariance matrix $\Sigma$. We need to show that $\rho^2_{ac|Z_1}(\Sigma) \le \rho^2_{ac|Z_2}(\Sigma)$. W.l.o.g. we assume that for $i = 1, 2$
there is no $z_i \in Z_i$ such that $ac \perp\!\!\!\perp Z_i \setminus \{z_i\} \mid z_i$. We consider several cases below:

Case 1. If $Z_1 \cap {}_a\pi_c \neq \emptyset$, then $a \perp\!\!\!\perp c \mid Z_1$, $\rho^2_{ac|Z_1} = 0$ and the result is trivial.

We initially assume that $Z_1$ separates $Z_2$ from $a$ and $c$. This implies that for each $z_2 \in Z_2$ there is a
$z_1 \in Z_1$ and $z_1' \in Z_1$ such that $z_1 \in {}_a\pi_{z_2}$ and $z_1' \in {}_c\pi_{z_2}$.

Case 2. If $Z_2 \cap {}_a\pi_c \neq \emptyset$, then $z_1 \in {}_a\pi_{z_2} \subseteq {}_a\pi_c$. This implies that $a \perp\!\!\!\perp c \mid Z_1$ and $\rho^2_{ac|Z_1} = 0$.

Case 3. Now let (Z x U Z 2 ) fl a 7T c = 0. Suppose Z x = {zn, z X2 , . ■., Zi ni } and Z 2 = {z 2 i, z 22 , ■ ■ 
Z2n 2 }- Suppose Xi = a TT Zli D c^zu b 1 a^c- Since G is a tree Xi is unique for z t . Also suppose that N % = 
{ z 2 i £ Z 2 : Zu £ a 7 T , 2 . n C 7T Z }. Again from the structure of G it is clear that are disjoint and Z 2 = 
U ‘i= 1 N i . We don’t exclude the possibility that may be 0 for some i. Using m we can write: 


n 2 Til 

Pac\Zt yi - ^ac|zn...zi(i_i)ZiiAf i+1 ...iV ni 

O 5 ~ ii ~0 1 

"ac|Z 2 i —1 "ac\zii...zi(i-i)N i N i+1 ...N ni 
(9)

It is sufficient to show that each factor in the product (9) is bounded by 1. Consider the $i$th factor,


fi = 


Pac\zii...z 1 ( i -i)Zi i N i+1 ...N ni 


Pac\zii...zi( i - 1 - ) N i N i+1 ...N ni 

Notice that the factor /,; depends only on the subgraph Gy i of G defined by the vertex set: 


Vi = 



^zij U C^Zy, 


uuu (« 


7TJj) U „7T~Cj) 


^4’eiv. 


< '2k 


*2k , 


r Bi 


It is clear that, Gy i is a tree. Let us denote Bi = {zn,..., Zi(,_i)} U (u"C +1 1\L) and B? = \4 \ Bi. 

Now from the structure of Gy i we note that (i) Xi G a 7T c so a _LL c\xiBi, (ii) Xi G Q 7T 2li and x i G cE zy 

implying a _LL Zy\xy Bi and (iii) Zy G M (o v ( a 7T~(») D C 7T_(0 ) it follows that ac _LL NAzyBi. 

Z 2k^ iy/ i \ 2 fc 2k J 

From Lemma |T| it follows that the triples (a, c | Xi), (ac,zy | xS) and ( ac,Ni | zi*) are in 3 (G) 
3 (G B c). It is obvious that, 

Pac\B i z li ( S ) _ P 2 ac\zy 

P ac\BiNi (^) Pac\Ni 

Now consider the following sub-cases: 

a. If AC = 0 or N^ = z 2 i, from the Theorem[5]it follows that p 2 ac \ Zli (E B c B c| B .) < p 2 ac \ N . (E B c B c| B .). 

b. If Nj = {z 2 i,..., Z 2 mi}i then using ac _LL N^zy, we can write: 


fi = 


P ac\ 


Z 1 iZ22---Z2rr 




Plc\z 21 Z22-Z2 mi 

By following the same argument as above and conditioning on {Z 22 , • •., Z 2 TOi } it follows that /» < 1. 
Now suppose that there is a Z' 2 C Z 2 s.t. Z 2 is not separated from a and c by Zi, but because of the 
choice of parameters both p 2 aZ '\z 1 = p 2 c z'\z 1 = *-*• 

It can be shown that *£ c|(z , UZl) = pl c \ Z y So if ^2 n a TT c ^ 0 then p 2 ^ = p 2 ac |(z , UZi) = p 2 ac , Zl = 0. On 
the other hand if Z 2 D a 7T c = 0 we can write: 


( 10 ) 


P ac\(Z' 2 \JZi) 


P ac\Zi 

2 2 ' 
Pac\Z 2 P ac\(Z' 2 U{Z 2 \Z' 2 }) 


The fact that the ratio in (10) is less than 1 follows from the first part mutatis mutandis.


□ 


Proof of Corollary [2]. The assumptions imply that $Z_1$ separates $Z_2$ from $a$ and $c$. This is exactly Case 3
in the previous proof. □

Proof of Theorem [ 8 j We parametrise the Choleski decomposition A = BB T . 

Suppose Zi G Z\ and Z 2 G Z 2 such that ac /L Z\\Z 2 and ac JL z 2 \Zi. Let a 7T c = {a = v\, v 2 , ■ ■ ■, va = c}, 
a 7T c n a T^ zl n c 7T 21 i>i, a 7T c n a 7T Z2 n C 7T Z2 Vj, i,j G { 1 ,2,..., d }. Further let V /K Z1 { 1 H 1 x h • • ■ > x di zi} 
and „.7T 22 = {vj,y 1 ,... ,y d2 = Z 2 }. If i = j it is possible that v .Tt zl and V .TT Z2 intersect at more than one 
vertex. However, it does not change the proof, so w.l.g. we assume that i ^ j. Suppose 

Vi aE c G Vi^zi G vj^Z 2 

El = {(v2,vi),...,{v d ,v d - 1 ),(x 1 ,Vi),...,(z 1 ,x dl - 1 ),(y 1 ,v j ),...,(z 2 ,yd 2 -i)}- 

We list the variables in E as a 7l c , V1 TT Z , V2 ^ Z2 , V \ Vi, where the vertices in V \ Vj can be arranged in an 
arbitrary fashion. The matrix B inherits the same arrangement. 

The matrix B is given by, B u = 1, {if k = l}, B ki = -1, {if (M) G #/}, = -by {if {k,l) = (zi,x dl -i)}, 

Bid = ~b 2 , {if (M) = (z 2 ,x d2 - 1 )}, B k i = 0 , {otherwise}. 

It can be shown that the resulting $A$ is a n.n.d. matrix for all values of $b_1$ and $b_2$ and will represent all
the conditional independence relations on the tree under consideration.
Now choose $b_1 = 0$. This implies $\rho^2_{ac|Z_1} = \rho^2_{ac} > \rho^2_{ac|Z_2} = \rho^2_{ac|z_2}$. The opposite happens if $b_2 = 0$. This
completes the proof. □

Proof of Corollary [3]. The result is trivial if ${}_a\pi_{c'} \cap Z' \neq \emptyset$. Furthermore, by assumption if $Z$ intersects
${}_a\pi_c$, so does $Z'$. The non-trivial case can be shown by applying Theorem [6] and Corollary [2] respectively on
the factors below:

$$\frac{\rho^2_{ac'|Z'}}{\rho^2_{ac|Z}} = \frac{\rho^2_{ac'|Z'}}{\rho^2_{ac|Z'}} \cdot \frac{\rho^2_{ac|Z'}}{\rho^2_{ac|Z}}.$$

□

To prove Theorem [9] we need the following definitions from the literature of directed acyclic graphs.

Definition 3. A vertex $v$ on a path ${}_x\pi_y$ in a polytree is a collider on the path if there are vertices $v_1$ and $v_2$
on ${}_x\pi_y$ such that the edges $v_1 \to v$ and $v_2 \to v$ exist. A vertex on a path ${}_x\pi_y$ in a polytree is a non-collider
on the path if it is not a collider on ${}_x\pi_y$.

Definition 4 (d-connection). A path ${}_x\pi_y$ between $x$ and $y$ in a DAG is said to be d-connecting given a set
$Z$ (possibly empty) if 1. every non-collider on ${}_x\pi_y$ is not in $Z$ and 2. every collider on ${}_x\pi_y$ is in $an(Z)$.
Here $an(Z) = \cup_{z \in Z}\, an(z)$.

If there is no path d-connecting $x$ and $y$ given $Z$, then $x$ and $y$ are said to be d-separated given $Z$.

Definition 5. For disjoint sets $X$, $Y$, $Z$, where $Z$ may be empty, $X$ and $Y$ are d-separated given $Z$ if for
every pair $x, y$, with $x \in X$ and $y \in Y$, $x$ and $y$ are d-separated given $Z$.

Definition 6. We say a density $f$ factors according to a DAG if, for three disjoint sets $X$, $Y$ and $Z$,
$X \perp\!\!\!\perp Y \mid Z$ according to $f$ whenever $X$ is d-separated from $Y$ given $Z$.
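
Definitions 3 and 4 translate directly into a small procedure for polytrees, where the path between two vertices is unique. The following is a minimal sketch rather than a general DAG algorithm: it assumes the graph is a polytree given as a list of directed (parent, child) pairs, that the endpoints are not in $Z$, and that $an(z)$ includes $z$ itself. The function names and the example graph (a → c ← x, c → z) are illustrative assumptions.

```python
def ancestors(edges, targets):
    """an(Z): vertices with a directed path to some z in Z, including Z itself."""
    result = set(targets)
    changed = True
    while changed:
        changed = False
        for parent, child in edges:
            if child in result and parent not in result:
                result.add(parent)
                changed = True
    return result


def unique_path(edges, x, y):
    """The unique undirected path from x to y in a polytree (None if none exists)."""
    nbrs = {}
    for u, v in edges:
        nbrs.setdefault(u, set()).add(v)
        nbrs.setdefault(v, set()).add(u)
    stack = [[x]]
    while stack:
        path = stack.pop()
        if path[-1] == y:
            return path
        for w in nbrs.get(path[-1], ()):
            if w not in path:
                stack.append(path + [w])
    return None


def d_connected(edges, x, y, Z):
    """True if the unique path between x and y is d-connecting given Z."""
    path = unique_path(edges, x, y)
    if path is None:
        return False
    an_Z = ancestors(edges, Z)
    edge_set = set(edges)
    for prev, v, nxt in zip(path, path[1:], path[2:]):
        collider = (prev, v) in edge_set and (nxt, v) in edge_set
        if collider and v not in an_Z:
            return False   # a collider outside an(Z) blocks the path
        if not collider and v in Z:
            return False   # a non-collider in Z blocks the path
    return True


# Assumed example polytree: a -> c <- x and c -> z.
edges = [("a", "c"), ("x", "c"), ("c", "z")]
```

For instance, `d_connected(edges, "a", "x", set())` is false because the collider c is not in $an(\emptyset)$, while conditioning on c or on its child z opens the path.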


[Figure 12: polytree on the vertices $a$, $b$, $c$, $z_{11}$, $z_{12}$, $z_{21}$, $z_{22}$, $z_{31}, \ldots, z_{34}$; image not recovered.]

Figure 12. Example of a polytree discussed in Theorem [9]. Vertices $z_{11}$ and $z_{12}$ are relevant
to Case (i), $z_{21}$ and $z_{22}$ are relevant to Case (ii) below. The vertices $z_{3k}$, for $k = 1, \ldots, 4$,
correspond to Case (iii) in the proof below.

Proof of Theorem [9]. First of all note that, since $\rho^2_{ac|bz} \neq \rho^2_{ac|b}$, $ac \not\perp\!\!\!\perp z \mid b$. Further, since $a \neq c \neq b$,
$a \in an(c)$ and $c \in an(b)$, there are no colliders on ${}_a\pi_b$. We first show that $z \perp\!\!\!\perp ac$ iff $z \perp\!\!\!\perp c$. Clearly, $z \perp\!\!\!\perp ac$
implies $z \perp\!\!\!\perp c$. To show the converse first note that, since the graph is a polytree, if $z \perp\!\!\!\perp c$ there is at least
one collider $v$ on the unique path ${}_c\pi_z$ between $c$ and $z$. Clearly, $v$ cannot be on ${}_a\pi_c$, otherwise it will be
a collider on ${}_a\pi_c$. However, by construction ${}_c\pi_z \setminus {}_a\pi_c = ({}_a\pi_z \cap {}_c\pi_z) \setminus {}_a\pi_c$. So if $v$ is not on ${}_a\pi_c$, $v$ would
be a collider on ${}_a\pi_z$ as well. Thus, using the assumption that the graph is a polytree, $a \perp\!\!\!\perp z$ and our claim
follows.

A similar argument shows that $z \perp\!\!\!\perp b$ iff $z \perp\!\!\!\perp acb$. So $\rho^2_{ac|bz} \neq \rho^2_{ac|b}$ implies that $z \not\perp\!\!\!\perp b$. So only the following
three cases, (i) $a \perp\!\!\!\perp z$ and $c \not\perp\!\!\!\perp z$, (ii) $ac \perp\!\!\!\perp z$ (i.e. $z \perp\!\!\!\perp c$) and (iii) $a \not\perp\!\!\!\perp z$ and $c \not\perp\!\!\!\perp z$, are possible. We first
consider the if parts:

Case (i). We show that there is a vertex $v_1$ such that $az \perp\!\!\!\perp cb \mid v_1$. $a \perp\!\!\!\perp z$ implies there is at least one collider
$v_1$ on ${}_a\pi_z$, $a \neq z \neq v_1$. Again by construction ${}_c\pi_{v_1} \setminus {}_a\pi_c = ({}_a\pi_{v_1} \cap {}_c\pi_{v_1}) \setminus {}_a\pi_c$. Thus, if $v_1 \notin {}_a\pi_c$, $v_1$ is a
collider on ${}_c\pi_z$ as well, which would imply $c \perp\!\!\!\perp z$. Thus $v_1 \in {}_a\pi_c$. Clearly, $v_1$ cannot be a collider on ${}_a\pi_c$.
Thus $v_1$ is the only collider on ${}_a\pi_z$ and it is not a collider on ${}_a\pi_c$ and ${}_c\pi_z$. Thus, from the definition of
d-separation it follows that $az \perp\!\!\!\perp cb \mid v_1$. From Theorem [4] (ii) it follows that $\rho^2_{ac|bz} > \rho^2_{ac|b}$.

Case (ii). We show that $a \perp\!\!\!\perp z \mid cb$ and apply Theorem [3] with $x = c$. Since by assumption $c \perp\!\!\!\perp z$ and $b \not\perp\!\!\!\perp z$,
as in Case (i) above there is a vertex $v_2 \in {}_c\pi_b$ such that $v_2$ is a collider on ${}_c\pi_z$ but not a collider on ${}_b\pi_z$.
Note that $v_2 \neq c$ or $v_2 \neq z$. Thus $c$ is a non-collider on both ${}_a\pi_z$ and ${}_a\pi_b$ and $c$ d-separates $a$ from $\{b, z\}$.
This implies $a \perp\!\!\!\perp bz \mid c$, which in turn gives $a \perp\!\!\!\perp z \mid cb$. Now from Theorem [3] we get $\rho^2_{ac|bz} < \rho^2_{ac|b}$.

Case (iii). Since $a \not\perp\!\!\!\perp z$, it follows that $c \not\perp\!\!\!\perp z$ and $b \not\perp\!\!\!\perp z$. This implies there is no collider on ${}_a\pi_z$, ${}_c\pi_z$
and ${}_b\pi_z$. Let $v_3 = {}_a\pi_z \cap {}_c\pi_z \cap {}_b\pi_z \cap {}_a\pi_b$. Clearly, $v_3$ is a non-collider on all these paths. So it follows
that $acb \perp\!\!\!\perp z \mid v_3$ (Lauritzen, 1996, page 29). This implies $ac \perp\!\!\!\perp z \mid bv_3$. Further, if $v_3 \in {}_a\pi_c$, $a \perp\!\!\!\perp cb \mid v_3$ and
$a \perp\!\!\!\perp c \mid bv_3$. It is possible that $z = v_3$. Now if $v_3 \in {}_a\pi_c$, Theorem [3] with $x = v_3$ implies $\rho^2_{ac|bz} < \rho^2_{ac|b}$. Note
that in this case if $v_3 = z$, $\rho^2_{ac|bz} = 0$. If $v_3 \notin {}_a\pi_c$, we consider two cases. Case (a): $z = v_3 \in {}_c\pi_b$. Clearly
$ac \perp\!\!\!\perp b \mid z$. Now using Theorem [3] we get $\rho^2_{ac|bz} = \rho^2_{ac|z} < \rho^2_{ac|b}$. Case (b): when $v_3 \notin {}_c\pi_b$, use Theorem [3] on the
conditional covariance given $b$ with $x = c$ to get $\rho^2_{ac|bz} < \rho^2_{ac|b}$.

The only if parts follow from the if parts and the fact that the above three are the only possible cases under
our assumptions.

□


Appendix B. Mixed ancestral graphs 

In this supplement we briefly discuss mixed ancestral graphs. Our discussion closely follows Richardson and Spirtes
(2002). We also refer to the same text for a more detailed treatment of the class of these graphs.

A graph $G$ is an ordered pair $(V, E)$ where $V$ is a set of vertices and $E$ is a set of edges.

A mixed graph is a graph containing three types of edges: undirected ($-$), directed ($\to$) and bidirected
($\leftrightarrow$). The following terminology is used to describe relations between variables in such a graph:

(1) If $\alpha - \beta$ in $G$, then $\alpha$ is a neighbour of $\beta$ and $\alpha \in ne(\beta)$.

(2) If $\alpha \to \beta$ in $G$, then $\alpha$ is a parent of $\beta$ and $\alpha \in pa(\beta)$.

(3) If $\beta \to \alpha$ in $G$, then $\alpha$ is a child of $\beta$ and $\alpha \in ch(\beta)$.

(4) If $\alpha \leftrightarrow \beta$ in $G$, then $\alpha$ is a spouse of $\beta$ and $\alpha \in sp(\beta)$.

Definition 7. A vertex $\alpha$ is said to be an ancestor of a vertex $\beta$ if either there is a directed path $\alpha \to \cdots \to \beta$
from $\alpha$ to $\beta$, or $\alpha = \beta$. Further, for $X \subseteq V$ its ancestor set is defined as:

$an(X) = \{\alpha : \alpha$ is an ancestor of $\beta$ for some $\beta \in X\}$.

Definition 8. A vertex $\alpha$ is said to be anterior to a vertex $\beta$ if there is a path ${}_\alpha\pi_\beta$ on which every edge is
either of the form $\gamma - \delta$, or $\gamma \to \delta$ with $\delta$ between $\gamma$ and $\beta$, or $\alpha = \beta$; that is, there are no edges $\gamma \leftrightarrow \delta$ and
there are no edges $\delta \to \gamma$ pointing toward $\alpha$. Further, for $X \subseteq V$ its anterior set is defined as:

$ant(X) = \{\alpha : \alpha$ is anterior to $\beta$ for some $\beta \in X\}$.

Definition 9. An ancestral graph $G$ is a mixed graph in which the following conditions hold for all vertices
$\alpha$ in $G$:

(1) $\alpha \notin ant(pa(\alpha) \cup sp(\alpha))$ and

(2) if $ne(\alpha) \neq \emptyset$ then $pa(\alpha) \cup sp(\alpha) = \emptyset$.

The d-separation criterion for DAGs can be extended to the m-separation criterion for mixed ancestral graphs.

A non-endpoint vertex $\zeta$ on a path is a collider on the path if the edges preceding and succeeding $\zeta$ on
the path have an arrowhead at $\zeta$, i.e. $\to \zeta \leftarrow$, $\leftrightarrow \zeta \leftrightarrow$, $\to \zeta \leftrightarrow$. A non-endpoint vertex $\zeta$ on a path
which is not a collider is a noncollider on the path.

A path between vertices $\alpha$ and $\beta$ in an ancestral graph $G$ is said to be m-connecting given a set $Z$ (possibly
empty), with $\alpha, \beta \notin Z$, if:

(1) every noncollider on the path is not in $Z$, and

(2) every collider on the path is in $ant(Z)$.

If there is no path m-connecting $\alpha$ and $\beta$ given $Z$, then $\alpha$ and $\beta$ are said to be m-separated given $Z$. Non-empty
sets $X$ and $Y$ are m-separated given $Z$ if for every pair $\alpha, \beta$ with $\alpha \in X$ and $\beta \in Y$, $\alpha$ and $\beta$ are
m-separated given $Z$ ($X$, $Y$ and $Z$ are disjoint sets).

Figure 13.


A distribution $F$ is said to satisfy the conditional independence relations represented by a mixed ancestral
graph if for disjoint subsets $X$, $Y$ and $Z$, $X \perp\!\!\!\perp Y \mid Z$ according to $F$ whenever $X$ is m-separated from $Y$ given
$Z$.


B.1. Examples of mixed ancestral graphs in the main text.


Example 1. Consider the mixed ancestral graph in Figure 13(a). There is more than one path connecting
$a$ and $c$. Each of them has a collider on it. For example, $y_1$ is a collider on the path $\{a, y_1, c\}$. So $a$ is
m-separated from $c$ given $\emptyset$. Thus $a \perp\!\!\!\perp c$. Further note that $x_2$ is a noncollider on each path connecting
$\{a, c\}$ and $z$. Thus $ac \perp\!\!\!\perp z \mid x_2$. Similarly, $ac \perp\!\!\!\perp z' \mid z$.


Example 2. Now we consider the graph in Figure 13(b). Clearly $a \perp\!\!\!\perp c$. $x$ is a collider on the paths $\{a, x, z\}$
and $\{c, x, z\}$. Further, $b$ is a collider on the paths $\{a, x, b, z\}$ and $\{c, x, b, z\}$. So $b$ and $x$ m-separate $a$ and
$c$ from $z$ given $\emptyset$. So $ac \perp\!\!\!\perp z$. Now note that $x$ is a noncollider on the paths $\{a, x, b\}$ and $\{c, x, b\}$. Also $z$ is
a collider on the paths $\{a, x, z, b\}$ and $\{c, x, z, b\}$. This implies $\{a, c\}$ is m-separated from $b$ given $x$, but not
given $zx$.


Acknowledgement The author would like to thank Michael Perlman, Thomas Richardson, Mathias 
Drton, Antar Bandyopadhyay, the referees and the associate editor for their useful comments and suggestions 
during the preparation of this article. 


References 

Bekker PA, de Leeuw J (1987) The rank of reduced dispersion matrices. Psychometrika 52:125-135
Chaudhuri S (2005) Using the structure of d-connecting paths as a qualitative measure of the strength of dependence. PhD thesis, Department of Statistics, University of Washington, Seattle
Chaudhuri S (2013) Qualitative inequalities for squared partial correlations of a Gaussian random vector. Tech. Rep. 1/2013, Department of Statistics and Applied Probability, National University of Singapore
Chaudhuri S, Richardson TS (2003) Using the structure of d-connecting paths as a qualitative measure of the strength of dependence. In: Proceedings of the Nineteenth Conference on Uncertainty in Artificial Intelligence, Morgan Kaufmann, San Francisco, CA, pp 116-123
Chaudhuri S, Tan GL (2010) On qualitative comparison of partial regression coefficients for Gaussian graphical Markov models. In: Viana MAG, Wynn HP (eds) Algebraic Methods in Statistics and Probability II, Contemporary Mathematics, vol 516, Providence, Rhode Island: American Mathematical Society, pp 125-133
Cheng J, Greiner R, Kelly J, Bell D, Liu W (2002) Learning Bayesian networks from data: an information-theory based approach. Artificial Intelligence 137:43-90
Chickering D, Meek C (2006) On the compatibility of faithfulness and monotone DAG faithfulness. Artificial Intelligence 170:653-666
Cover T, Thomas J (2006) Elements of Information Theory. Hoboken, New Jersey: John Wiley & Sons, Inc
Drton M, Sturmfels B, Sullivant S (2007) Algebraic factor analysis: tetrads, pentads and beyond. Probability Theory and Related Fields 138:463-493
Greenland S (2003) Quantifying biases in causal models: classical confounding versus collider-stratification bias. Epidemiology 14:300-306
Greenland S, Pearl J (2011) Adjustments and their consequences-collapsibility analysis using graphical models. International Statistical Review 79(3):401-426
Jarvie HP, Neal C, Withers PJA, Wescott C, Acornley RM (2005) Nutrient hydrochemistry for a groundwater-dominated catchment: the Hampshire Avon, UK. Science of The Total Environment pp 143-158
Kuroki M, Cai Z (2006) On recovering a population covariance matrix in the presence of selection bias. Biometrika 93(3):601-611
Lauritzen S (1996) Graphical Models. Oxford: Oxford University Press, Inc
Lnenicka R, Matus F (2007) On Gaussian conditional independence structures. Kybernetika 43(3):327-342
Matus F (2005) Conditional independence in Gaussian vectors and rings of polynomials. In: Kern-Isberner G, Rodder W, Kulmann F (eds) Conditionals, Information, and Inference (WCII 2002 Hagen), Berlin Heidelberg: Springer, pp 152-161
Matus F (2006) Piecewise linear conditional information inequality. IEEE Transactions on Information Theory 52(1):236-238
Matus F (2007) Infinitely many information inequalities. In: Information Theory, 2007. ISIT 2007. IEEE International Symposium on, pp 41-44
Richardson T, Spirtes P (2002) Ancestral graph Markov models. The Annals of Statistics 30(4):962-1030
Roberts GO, Sahu SK (1997) Updating schemes, correlation structure, blocking and parameterization for the Gibbs sampler. Journal of the Royal Statistical Society Series B Methodological 59(2):291-317
Rodriguez-Iturbe I, Rinaldo A (2001) Fractal River Basins: Chance and Self-Organization. Cambridge: Cambridge University Press
Shimizu S, Hoyer PO, Hyvarinen A, Kerminen A (2006) A linear non-Gaussian acyclic model for causal discovery. Journal of Machine Learning Research 7:2003-2030
Spirtes P, Glymour C, Scheines R (2000) Causation, Prediction, and Search. Cambridge, Massachusetts: MIT Press
VanderWeele TJ, Robins JM (2007) Directed acyclic graphs, sufficient causes, and the properties of conditioning on a common effect. American Journal of Epidemiology 166(9):1096-1104
VanderWeele TJ, Robins JM (2010) Signed directed acyclic graphs for causal inference. Journal of the Royal Statistical Society: Series B Methodological 72(1):111-127
Verma T, Pearl J (1990) Equivalence and synthesis of causal models. In: Bonissone P, Henrion M, Kanal L, Lemmer J (eds) Proceedings of the Sixth Conference on Uncertainty in Artificial Intelligence, AUAI Press, Corvallis, Oregon, pp 220-227
Wermuth N, Cox DR (2008) Distortion of effects caused by indirect confounding. Biometrika 95(1):17-33
Whittaker J (2008) Graphical Models in Applied Multivariate Statistics. Chichester: John Wiley & Sons, Inc
Xu L, Pearl J (1989) Structuring causal tree models with continuous variables. In: Henrion M, Shachter R, Kanal L, Lemmer J (eds) Proceedings of the Fifth Conference on Uncertainty in Artificial Intelligence, AUAI Press, Corvallis, Oregon, pp 170-178
Zhang Z, Yeung RW (1997) A non-Shannon-type conditional inequality of information quantities. IEEE Transactions on Information Theory 43(6):1982-1986

Department of Statistics and Applied Probability, National University of Singapore, Singapore, 117546.

E-mail address: sanjay@stat.nus.edu.sg